C++ input: problems encountered when reading Chinese characters

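Well, my project has to do with processing Chinese text, so I need to read Chinese characters in C++ and work on them.

My first attempt used wchar_t, the wide character type, written like this: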

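    #include <cstdio>
    #include <cwchar>

    using namespace std;

    int main() {
        wchar_t ch[256] = L"";   // a real buffer; a bare uninitialized wchar_t* has nowhere to store input
        scanf("%255ls", ch);     // %ls is the standard spelling of %S; 255 bounds the read
        printf("%ls", ch);
        return 0;
    }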

And the result? No output at all...

So let's change it a bit:

    #include <cstdio>
    #include <cwchar>

    using namespace std;

    int main() {
        wchar_t ch;
        ch = getwchar();                         // returns WEOF on end of input or on a decoding error
        printf("%lc", static_cast<wint_t>(ch));  // %lc is the standard spelling of %C
        return 0;
    }

Still no output. I tried changing the print statement to printf("%d", ch); and it printed -1.

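My first reading was that even a wide character gets read in at char width, which would make wchar_t a type with no real use, at least for now. That reading is probably wrong, though: -1 is how WEOF prints here, which means getwchar() failed instead of returning a character. The likely cause is that neither program calls setlocale(LC_ALL, ""), so both run in the default "C" locale, which generally cannot decode the multi-byte encoding of Chinese input.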
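
One thing the two attempts above never do is select a locale, and the wide-character input functions decode bytes according to the current locale. Here is a minimal sketch of a wide-character reader, assuming the environment names a locale whose encoding matches the input (for example a UTF-8 one). The helper names use_environment_locale and read_wide_line are made up for illustration; I have not checked this against the setup described in this post.

```cpp
#include <clocale>
#include <cstdio>
#include <cwchar>
#include <string>

// Switch from the default "C" locale to the one named by the environment
// (LANG, LC_ALL, ...). Returns false if that locale is not available.
bool use_environment_locale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}

// Hypothetical helper: read wide characters from `in` until a newline,
// end of input, or a decoding error (fgetwc reports both as WEOF).
std::wstring read_wide_line(std::FILE* in) {
    std::wstring line;
    for (;;) {
        const std::wint_t c = std::fgetwc(in);
        if (c == WEOF || c == static_cast<std::wint_t>(L'\n')) {
            break;
        }
        line.push_back(static_cast<wchar_t>(c));
    }
    return line;
}
```

In main this would be used by calling use_environment_locale() first, then read_wide_line(stdin), and printing the result with a wide output function such as fputws. Without the setlocale call, Chinese input is likely to fail on the first character and the returned line stays empty.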

So what now? The great czr told me it can simply be written like this:

    #include <iostream>
    #include <string>

    using namespace std;

    int main() {
        string st;
        cin >> st;
        cout << st << endl;
        return 0;
    }

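I first assumed the string class does something fancy inside, but it is simpler than that: std::string just stores the bytes it is given, so the multi-byte encoding of each Chinese character passes through cin and cout untouched. Either way, Chinese goes in and Chinese comes out.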
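
One thing to keep in mind with this approach: std::string counts bytes, not characters, so st.size() is usually larger than the number of Chinese characters read. Here is a small sketch of counting characters instead, assuming the input is valid UTF-8 (the post does not say which encoding is in use, and under GBK the byte patterns are different). utf8_length is a hypothetical helper, not a standard function.

```cpp
#include <cstddef>
#include <string>

// Count code points in `s`, ASSUMING it holds valid UTF-8.
// Continuation bytes have the form 10xxxxxx; every other byte
// starts a new code point, so we count only those.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

With UTF-8 input, the two-character word 汉字 occupies six bytes, so st.size() gives 6 while utf8_length(st) gives 2.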

But I wasn't happy with that, because cin and cout are too slow. So we came up with a slightly odd trick:

    #include <cstdio>
    #include <iostream>
    #include <string>

    using namespace std;

    int main() {
        ios::sync_with_stdio(false);
        string st;
        cin >> st;
        cout << st << endl;
        return 0;
    }

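The new line is the first statement in main, std::ios::sync_with_stdio(false);. With just this one call, cin becomes about as fast as scanf.
"cin is slow for a reason. By default, cin is always kept in sync with stdin, which means the two can be mixed without worrying about the file pointer getting confused; likewise cout and stdout can be mixed without the output order being scrambled. It is precisely this compatibility feature that gives cin a lot of extra overhead. How do you turn it off? A single statement, std::ios::sync_with_stdio(false);, removes the synchronization between cin and stdin." -- by byvoid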

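And so the annoying business of reading Chinese in C++ is finally settled. As a side note, having always used char*, I now have to go and learn the operations of the over-encapsulated string class, lol.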

Time: 2024-10-07 05:16:13
