csharp: Converting Chinese characters to Unicode

' Replace every character outside the range 0-255 with an HTML
' hexadecimal character reference (&#xNNNN;); keep the rest unchanged.
Function chinese2unicode(Str)
    Dim i, Str_one, Str_unicode
    Str_unicode = ""
    For i = 1 To Len(Str)
        Str_one = Mid(Str, i, 1)
        ' AscW can return a negative value for code units from &H8000 upward,
        ' hence the "< 0" test in addition to "> 255".
        If AscW(Str_one) < 0 Or AscW(Str_one) > 255 Then
            Str_unicode = Str_unicode & "&#x" & Hex(AscW(Str_one)) & ";"
        Else
            Str_unicode = Str_unicode & Str_one
        End If
    Next
    chinese2unicode = Str_unicode
End Function
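
        /// <summary>
        /// Converts each character outside the range 0-255 to an HTML hexadecimal
        /// character reference; other characters are kept as-is.
        /// Sample result, apparently after URL-encoding: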
        /// %26%23x4EB2%3B%26%23x7231%3B%26%23x7684%3B%26%23x4F1A%3B%26%23x5458%3BTeresaLiu%2C%26%23x516D%3B%26%23x798F%3B%26%23x73E0%3B%26%23x5BF6%3B%26%23x6703%3B%26%23x54E1%3B%26%23x5BC6%3B%26%23x78BC%3B%26%23x4FEE%3B%26%23x6539%3B%26%23x9805%3B%26%23x901A%3B%26%23x77E5%3B%26%23xFF1A%3B%26%23x95A3%3B%26%23x4E0B%3B%26%23x5DF2%3B%26%23x6210%3B%26%23x529F%3B%26%23x66F4%3B%26%23x6539%3B%26%23x5BC6%3B%26%23x78BC%3B%26%23xFF0C%3B%26%23x5982%3B%26%23x6709%3B%26%23x67E5%3B%26%23x8A62%3B%26%23xFF0C%3B%26%23x8ACB%3B%26%23x81F4%3B%26%23x96FB%3B%26%23x9999%3B%26%23x6E2F%3B27109368%26%23xFF0F%3B%26%23x4E2D%3B%26%23x570B%3B4008846222
        ///塗聚文 20140724
        /// </summary>
        /// <param name="str">Input text; may be null or empty.</param>
        /// <returns>The converted text, or an empty string for null or empty input.</returns>
        private string chinese2uncode(string str)
        {
            if (string.IsNullOrEmpty(str))
            {
                return "";
            }

            var outStr = new System.Text.StringBuilder();
            foreach (char c in str)
            {
                // In .NET, AscW returns 0..65535, so the "< 0" test carried over from
                // the VB version can never be true; "> 255" alone is enough.
                int code = Microsoft.VisualBasic.Strings.AscW(c);

                // Alternative test for Chinese only:
                // Regex.IsMatch(c.ToString(), @"[\u4e00-\u9fa5]")
                if (code > 255)
                {
                    // Alternative output format: "\\u" + ((int)c).ToString("x")
                    outStr.Append("&#x");
                    outStr.Append(Microsoft.VisualBasic.Conversion.Hex(code));
                    outStr.Append(';');
                }
                else
                {
                    outStr.Append(c);
                }
            }
            return outStr.ToString();
        }
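
The same conversion can be sketched without the Microsoft.VisualBasic reference. This is a minimal sketch, not the original author's code: the class name `NcrEncoder` and the methods `ToHexNcr` and `FromHexNcr` are hypothetical names introduced for illustration. It assumes the goal is the `&#xNNNN;` form produced above, and uses `Uri.EscapeDataString` to show how the `%26%23x...%3B` form in the summary comment could arise.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper; class and method names are illustrative only.
public static class NcrEncoder
{
    // Encode every UTF-16 code unit above 255 as "&#xHHHH;" (uppercase hex).
    public static string ToHexNcr(string text)
    {
        if (string.IsNullOrEmpty(text)) return "";
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (c > 255)
                sb.Append("&#x").Append(((int)c).ToString("X")).Append(';');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Reverse step: turn "&#xHHHH;" references back into characters.
    public static string FromHexNcr(string text)
    {
        if (string.IsNullOrEmpty(text)) return "";
        return Regex.Replace(text, "&#x([0-9A-Fa-f]{1,4});",
            m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }

    public static void Main()
    {
        string encoded = ToHexNcr("亲爱的会员TeresaLiu");
        Console.WriteLine(encoded);
        // → &#x4EB2;&#x7231;&#x7684;&#x4F1A;&#x5458;TeresaLiu
        Console.WriteLine(Uri.EscapeDataString(encoded));
        // → %26%23x4EB2%3B%26%23x7231%3B%26%23x7684%3B%26%23x4F1A%3B%26%23x5458%3BTeresaLiu
        Console.WriteLine(FromHexNcr(encoded));
        // → 亲爱的会员TeresaLiu
    }
}
```

Like the original, this works per UTF-16 code unit, so a character outside the Basic Multilingual Plane is written as two references (one per surrogate) rather than as a single code point.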

Time: 2024-10-03 23:34:08
