Unicode explorer


It can be cumbersome to work out some of these details by hand, so you can use the little JavaScript-based tool below to display useful information about any string you enter into the text field. Currently I don't have any support for going the other way (e.g. from UTF-16 code units to text), but hopefully this is still useful.
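For the reverse direction, which the tool does not cover, JavaScript's built-in `String.fromCharCode` and `String.fromCodePoint` may be enough for quick experiments. The following is only a minimal sketch: the helper name `fromUtf16CodeUnits` is hypothetical and is not part of the tool described here.

```javascript
// Hypothetical helper (an assumption, not the page's tool):
// turns an array of UTF-16 code units (numbers) back into text.
function fromUtf16CodeUnits(units) {
  // String.fromCharCode treats each argument as one UTF-16 code unit,
  // so a high surrogate followed by a low surrogate forms one character.
  return String.fromCharCode(...units);
}

// 0xD842 0xDF20 is the surrogate pair for U+20B20.
const fromUnits = fromUtf16CodeUnits([0xd842, 0xdf20]);

// String.fromCodePoint takes whole code points instead of code units.
const fromPoint = String.fromCodePoint(0x20b20);

console.log(fromUnits === fromPoint); // true
console.log(fromUnits.codePointAt(0).toString(16)); // "20b20"
```

Working from code units is closest to what a C# or Java `char` array holds; working from code points avoids having to compute the surrogate pair by hand.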

Enter text here:

Character Unicode UTF-16 UTF-8

This table breaks down the text in the text box into Unicode characters. It does not perform any kind of normalization, so an accented character may appear as one character or more, depending on whether it is entered as a single character including the accent (e.g. é, U+00E9) or as a non-accented character followed by combining characters (e.g. é, U+0065 followed by U+0301; yes, that really is different from the previous example: copy and paste them both to see!). However, it does break the input into Unicode characters rather than just UTF-16 code units, so a surrogate pair is treated as a single character. For example, 𠬠 (a CJK ideograph outside the Basic Multilingual Plane) is shown as U+20B20.
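The difference between the two forms of the accented character, and between code units and code points, can be checked directly in JavaScript. This is a small sketch rather than the tool's own code; the function name `codePoints` is made up for illustration.

```javascript
// Precomposed form: a single code point, U+00E9.
const precomposed = "\u00E9";
// Decomposed form: "e" (U+0065) followed by COMBINING ACUTE ACCENT (U+0301).
const decomposed = "e\u0301";

// Illustrative helper: formats each code point of a string as U+XXXX.
function codePoints(text) {
  // Spreading a string iterates by code point, so a surrogate pair
  // stays together instead of being split into two code units.
  return [...text].map(
    (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

console.log(codePoints(precomposed)); // [ 'U+00E9' ]
console.log(codePoints(decomposed)); // [ 'U+0065', 'U+0301' ]

// Without normalization the two strings are different...
console.log(precomposed === decomposed); // false
// ...but they compare equal once the second is normalized to NFC.
console.log(precomposed === decomposed.normalize("NFC")); // true

// A character outside the BMP: two UTF-16 code units, one code point.
const astral = "\u{20B20}";
console.log(astral.length); // 2
console.log(codePoints(astral)); // [ 'U+20B20' ]
```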

The first column simply displays the character. The second column displays the Unicode code point (U+0000 to U+10FFFF), suitable for looking up in the Unicode code charts. The third column displays the UTF-16 code units which make up the character: these are the char values which would appear in a C# (or Java, or JavaScript) string. For characters in the Basic Multilingual Plane this will just be a single code unit; for other characters it will be a surrogate pair (high then low). The fourth column displays the UTF-8 representation of the character in bytes.
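The four columns could be computed along the following lines. This is a sketch under assumptions, not the tool's actual source: `explore` and `hex` are hypothetical names, and it relies on the standard `TextEncoder` being available (as it is in modern browsers and Node.js).

```javascript
// TextEncoder always produces UTF-8 bytes.
const encoder = new TextEncoder();

// Illustrative helper: upper-case hex, zero-padded to the given width.
function hex(value, width) {
  return value.toString(16).toUpperCase().padStart(width, "0");
}

// Hypothetical function: returns one row per Unicode code point in the text.
function explore(text) {
  // Spreading iterates by code point, so surrogate pairs are kept whole.
  return [...text].map((ch) => {
    const utf16 = [];
    for (let i = 0; i < ch.length; i++) {
      utf16.push(hex(ch.charCodeAt(i), 4)); // one entry per UTF-16 code unit
    }
    return {
      character: ch,
      codePoint: "U+" + hex(ch.codePointAt(0), 4),
      utf16: utf16.join(" "),
      // Note: TextEncoder replaces a lone surrogate with U+FFFD.
      utf8: Array.from(encoder.encode(ch), (b) => hex(b, 2)).join(" "),
    };
  });
}

const rows = explore("\u00E9\u{20B20}");
console.log(rows[0]); // codePoint 'U+00E9', utf16 '00E9', utf8 'C3 A9'
console.log(rows[1]); // codePoint 'U+20B20', utf16 'D842 DF20', utf8 'F0 A0 AC A0'
```

The BMP character needs a single code unit, while U+20B20 shows up as a high surrogate followed by a low surrogate and takes four bytes in UTF-8.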

Reference: http://csharpindepth.com/Articles/General/Unicode.aspx

Time: 2024-08-07 08:25:49
