When handling unicode strings in Python 2.7 with the environment variable PYTHONIOENCODING set to utf-8, printing a unicode string fails with [Errno 0] or [Errno 2] if the cmd code page is UTF-8 (65001).
#coding:utf-8
import os

# Switch the cmd code page to UTF-8 (65001)
os.system("chcp 65001")

a = u"你好こんにちは"
print a  # fails here with [Errno 0] or [Errno 2]
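This raises the error. If the string contains only ASCII characters, no error occurs; if cmd uses a different code page, the output may be garbled but no error is raised.
After some digging, this turns out to be a problem with how Windows implements the C write() function: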
https://bugs.python.org/issue1602#msg148990
The underlying cause of Python's write exceptions with cp65001 is: The ANSI C write() function as implemented by the Windows console returns the number of _characters_ written rather than the number of _bytes_, which Python reasonably interprets as a "short write error". It then consults errno, which gives the effectively random error message seen. This can be bypassed by using os.write(sys.stdout.fileno(), utf8str), which will a) succeed and b) return a count <= len(utf8str). With os.write() and an appropriate font, the Windows console will correctly display a large number of characters. Possible workaround: clear errno before calling write, check for non-zero errno after. The vast majority of (non-Python) applications never check the return value of write, so don't encounter this problem.
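To make the mismatch described in the issue concrete, here is a minimal sketch that only compares the two counts for the test string. It does not reproduce the console bug itself; the variable names are illustrative and not taken from the issue.

```python
# -*- coding: utf-8 -*-
# Illustration only: compare the character count with the UTF-8 byte count.

text = u"你好こんにちは"
utf8_bytes = text.encode("utf-8")

char_count = len(text)        # what the console write() reports (characters)
byte_count = len(utf8_bytes)  # what the caller expects to have written (bytes)

# A caller that compares the reported count with the byte count sees a
# smaller number and concludes that the write was cut short.
reported_as_short_write = char_count < byte_count
```

For an ASCII-only string the two counts are equal, which is consistent with the observation that ASCII-only output does not trigger the error.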
Solutions
Method 1: use the win_unicode_console module
1. Install
pip install win_unicode_console
2. Use
It is simple: import the module and call enable().
#coding:utf-8
import os
import win_unicode_console

# Enable the module's console streams before printing
win_unicode_console.enable()
os.system("chcp 65001")

a = u"你好こんにちは"
print a
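If the same script also has to run on machines without win_unicode_console, the call can be guarded. The following is a sketch under that assumption; the helper name enable_unicode_console is hypothetical and not part of the module.

```python
# -*- coding: utf-8 -*-
import sys


def enable_unicode_console():
    """Enable win_unicode_console when it is usable; return True on success.

    Hypothetical helper: only win_unicode_console.enable() comes from the module.
    """
    if sys.platform != "win32":
        # The module targets the Windows console; nothing to do elsewhere.
        return False
    try:
        import win_unicode_console
    except ImportError:
        # Module not installed (pip install win_unicode_console).
        return False
    win_unicode_console.enable()
    return True
```

Calling enable_unicode_console() once at the top of the script, before the first print, keeps the rest of the code unchanged.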
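Method 2: do not use print
According to the issue, the problem can be bypassed with os.write(sys.stdout.fileno(), utf8str).
In this case the string has no u prefix; a str (byte string) is written directly.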
#coding:utf-8
import os
import sys

os.system("chcp 65001")

# No u prefix: in Python 2 this is a str holding the file's UTF-8 bytes
a = "你好こんにちは"
os.write(sys.stdout.fileno(), a)
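When the data is already a unicode object, it has to be encoded to UTF-8 before being passed to os.write. Below is a minimal sketch of a wrapper that does this; the name console_write and its fd parameter are assumptions made for illustration.

```python
# -*- coding: utf-8 -*-
import os
import sys


def console_write(text, fd=None):
    """Write text to a file descriptor as UTF-8 bytes, bypassing print.

    Hypothetical helper. Returns whatever os.write returns; the return value
    is deliberately not used to retry, since per the issue the Windows
    console may report a count smaller than the byte length.
    """
    if fd is None:
        fd = sys.stdout.fileno()
    if not isinstance(text, bytes):
        # unicode in Python 2 (str in Python 3): encode before writing
        text = text.encode("utf-8")
    return os.write(fd, text)
```

With this helper, console_write(u"你好こんにちは\n") would take the place of print a in the example above.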
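Lazy approaches
1. Running the script in PyCharm does not raise the error; presumably PyCharm works around the problem itself.
2. If you only need to output Chinese, skip UTF-8 altogether: run chcp 936 and output a.encode("gbk","ignore").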
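One thing to keep in mind with the second shortcut is that the "ignore" handler silently drops any character that GBK cannot represent. A small sketch of that behavior, using a Hangul character as an assumed example of a non-GBK character:

```python
# -*- coding: utf-8 -*-
# Illustration only: how encode("gbk", "ignore") treats non-GBK characters.

chinese_only = u"你好"
mixed = u"你好안"  # the last character is Hangul, which GBK does not cover

encoded_chinese = chinese_only.encode("gbk", "ignore")
encoded_mixed = mixed.encode("gbk", "ignore")

# The Hangul character is dropped without any error, so both results match.
dropped_silently = encoded_mixed == encoded_chinese
```

So this approach is only safe when the output really is limited to characters that exist in GBK.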
Original post: https://www.cnblogs.com/LiuZhongbin888/p/10770232.html
Time: 2024-10-25 19:34:26