Unicode in C11

The C11 standard (ISO/IEC 9899:2011) introduced support for the UTF-8, UTF-16, and UTF-32 character encodings.

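UTF-8 text is stored in plain char, and its literals use the u8 prefix. Note that C11 defines u8 only for string literals (u8 character constants arrived later, in C23), and 你 takes three bytes in UTF-8, so it cannot fit in a single char. For example:

const char *c = u8"你";   /* three bytes plus the terminator */
const char *s = u8"你好";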
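
To make the difference between bytes and characters concrete, here is a minimal sketch that counts the code points in a UTF-8 string. The function name utf8_codepoints is made up for this illustration, and the sketch assumes the input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a well-formed, null-terminated UTF-8 string.
   Continuation bytes have the bit pattern 10xxxxxx, so every byte
   that does not match that pattern starts a new code point.
   (utf8_codepoints is a hypothetical helper, not a standard function.) */
size_t utf8_codepoints(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

Calling utf8_codepoints(u8"你好") should give 2, while strlen on the same string reports 6, since each of the two characters occupies three bytes.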

UTF-16 characters are declared with char16_t, and their literals use the u prefix. For example:

#include <uchar.h>

char16_t c = u'你';
const char16_t *s = u"你好";
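
One thing to keep in mind is that a single char16_t is a UTF-16 code unit, not necessarily a whole character: code points above U+FFFF are stored as a surrogate pair. The following is a rough sketch of how decoding could look. The names utf16_next and utf16_codepoints are invented for this example, and mapping unpaired surrogates to U+FFFD is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* Decode the code point starting at *p and advance *p past it.
   Unpaired surrogates are reported as U+FFFD (an assumption of this
   sketch). Hypothetical helper, not part of <uchar.h>. */
char32_t utf16_next(const char16_t **p)
{
    assert(p != NULL && *p != NULL);
    char32_t hi = *(*p)++;
    if (hi >= 0xD800 && hi <= 0xDBFF) {          /* high surrogate */
        char32_t lo = **p;
        if (lo >= 0xDC00 && lo <= 0xDFFF) {      /* matching low surrogate */
            ++*p;
            return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
        }
        return 0xFFFD;                           /* high surrogate alone */
    }
    if (hi >= 0xDC00 && hi <= 0xDFFF)
        return 0xFFFD;                           /* low surrogate alone */
    return hi;                                   /* ordinary BMP code unit */
}

/* Count code points in a null-terminated UTF-16 string. */
size_t utf16_codepoints(const char16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (*s != 0) {
        utf16_next(&s);
        ++n;
    }
    return n;
}
```

For u"你好" both the code-unit count and the code-point count are 2, but a string containing U+1F600 uses two char16_t elements for that one character.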

UTF-32 characters are declared with char32_t, and their literals use the U prefix. For example:

#include <uchar.h>

char32_t c = U'你';
const char32_t *s = U"你好";
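
Since each char32_t holds exactly one code point, converting it to UTF-8 (for example, to hand it to printf with %s) is mostly bit shuffling. Below is a sketch of such an encoder; utf32_to_utf8 is a name chosen for this example, and returning 0 for invalid input is an assumption of the sketch rather than any standard behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* Encode one code point as UTF-8 into out, which must hold 4 bytes.
   Returns the number of bytes written, or 0 for surrogates and for
   values above U+10FFFF. No terminator is written.
   (Hypothetical helper, not part of <uchar.h>.) */
size_t utf32_to_utf8(char32_t cp, char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 3 bytes, excluding surrogates */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes */
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Encoding U'你' (U+4F60) this way yields the three bytes E4 BD A0, the same bytes the compiler stores for u8"你".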

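Using char16_t and char32_t requires including the header <uchar.h>, which also declares the conversion functions mbrtoc16, c16rtomb, mbrtoc32, and c32rtomb. The standard library additionally offers wide-character functions such as swprintf, fwprintf, vwprintf, and wprintf, although these are declared in <wchar.h> and predate C11 (they have been part of the standard since C99 at the latest). Their strings are of type const wchar_t *, that is, pointers to wide characters. How Unicode characters are actually displayed is left to each platform.

On OS X and iOS, as of Apple LLVM 6.0, this C11 feature is still not fully supported. UTF-8, UTF-16, and UTF-32 literals all work, even though the system itself cannot parse UTF-32 encoded text, but the <uchar.h> header is not shipped. Foundation's unichar type can be used in place of char16_t. Also, printf cannot print UTF-16 characters; to print a UTF-16 character or string, use Foundation's NSLog function.

Here are some examples: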
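
For code that has to build both on platforms that ship <uchar.h> and on ones that do not, one possible workaround is to probe for the header and fall back to hand-written typedefs. This is only a sketch: __has_include is a Clang/GCC feature (standardized later in C23), HAVE_UCHAR_H is a macro name invented here, and the fallback types are an assumption based on how C11 defines char16_t and char32_t (as uint_least16_t and uint_least32_t).

```c
#include <assert.h>
#include <limits.h>

/* Use <uchar.h> when the compiler can tell us it exists. */
#if defined(__has_include)
#  if __has_include(<uchar.h>)
#    include <uchar.h>
#    define HAVE_UCHAR_H 1   /* hypothetical macro, local to this sketch */
#  endif
#endif

/* Otherwise declare the types ourselves, mirroring the C11 definitions. */
#ifndef HAVE_UCHAR_H
#  include <stdint.h>
typedef uint_least16_t char16_t;
typedef uint_least32_t char32_t;
#endif

/* Compile-time sanity checks on the widths we rely on. */
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t too narrow");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t too narrow");
```

With this in a shared header, the u and U literals shown earlier can be assigned to char16_t and char32_t variables on either kind of platform.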

#include <stdio.h>
#include <wchar.h>

- (void)viewDidLoad
{
    [super viewDidLoad];
    // Do any additional setup after loading the view, typically from a nib.

    const char *s = u8"你好，世界！";
    printf("The UTF-8 string is: %s\n", s);

    unichar ch = u'你';
    const unichar *us = u"好，世界！";
    NSLog(@"The UTF-16 text is: %C%S", ch, us);

    wprintf(L"iOS does not support printing wide-character Unicode text!\n");
}

In NSString format strings, %C corresponds to a UTF-16 code unit of type unichar (actually unsigned short), and %S corresponds to const unichar *, that is, a null-terminated UTF-16 string.

Time: 2024-10-07 06:32:28
