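Problem: the page source fetched with WebRequest comes back garbled.
Cause: 1. The response is decoded with the wrong encoding.
Solution: decode it with the page's actual encoding.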
WebRequest request = WebRequest.Create(Url);
WebResponse response = await request.GetResponseAsync();
Stream stream = response.GetResponseStream();
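// coding is the name of the page's encoding (for example "utf-8" or "gb2312"); in IE you can check it by right-clicking the page and looking at the Encoding menu.
StreamReader reader = new StreamReader(stream, Encoding.GetEncoding(coding));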
Result = reader.ReadToEnd();
reader.Dispose(); // disposing the reader also closes the underlying stream
response.Dispose();
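Instead of looking the encoding up by hand, the charset can often be read from the response's Content-Type header. The following is only a minimal sketch: CharsetHelper and FromContentType are hypothetical names that do not come from the original code, and it assumes the server actually sends a charset parameter. Pages that declare their encoding only in a meta tag are not covered, and on some runtimes legacy encodings such as gb2312 may not be available, in which case the fallback is returned.

```csharp
using System;
using System.Text;

public static class CharsetHelper
{
    // Hypothetical helper: extracts the charset from a Content-Type value
    // such as "text/html; charset=gb2312". Returns the fallback when no
    // charset parameter is present or the name is not a supported encoding.
    public static Encoding FromContentType(string contentType, Encoding fallback)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            return fallback;
        }

        foreach (string part in contentType.Split(';'))
        {
            string p = part.Trim();
            if (p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
            {
                // Drop the "charset=" prefix and any surrounding quotes.
                string name = p.Substring("charset=".Length).Trim().Trim('"', '\'');
                try
                {
                    return Encoding.GetEncoding(name);
                }
                catch (ArgumentException)
                {
                    return fallback; // empty, unknown, or unsupported encoding name
                }
            }
        }

        return fallback;
    }
}
```

With the code above, the reader could then be created as new StreamReader(stream, CharsetHelper.FromContentType(response.ContentType, Encoding.UTF8)), where UTF-8 as the fallback is an assumption.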
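2. The page is compressed.
Check whether the Content-Encoding header of the HTTP response is gzip; if it is, the body has to be decompressed before it is decoded. The code below is for WinRT.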
WebRequest request = WebRequest.Create(Url);
WebResponse response = await request.GetResponseAsync();
Debug.WriteLine(((HttpWebResponse)response).StatusDescription);
// If the response was compressed with GZip, decompress it before decoding.
if (response.Headers.AllKeys.Contains("Content-Encoding") && response.Headers["Content-Encoding"].ToLower() == "gzip")
{
    using (System.IO.Stream streamReceive = response.GetResponseStream())
    {
        using (var zipStream =
            new System.IO.Compression.GZipStream(streamReceive, System.IO.Compression.CompressionMode.Decompress))
        {
            using (StreamReader sr = new System.IO.StreamReader(zipStream, Encoding.GetEncoding(coding)))
            {
                Result = sr.ReadToEnd();
            }
        }
    }
}
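The snippet above only assigns Result when the response is gzip-compressed. As a rough sketch, the two cases (compressed and uncompressed) could be folded into one helper. BodyDecoder and ReadBody are hypothetical names that are not part of the original code, and only gzip is handled here; other Content-Encoding values such as deflate are read as-is.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class BodyDecoder
{
    // Hypothetical helper: reads a response body as text, decompressing it
    // first when the Content-Encoding header value is "gzip" (case-insensitive).
    // The body stream is closed when reading finishes.
    public static string ReadBody(Stream body, string contentEncoding, Encoding encoding)
    {
        bool isGzip = contentEncoding != null &&
            string.Equals(contentEncoding.Trim(), "gzip", StringComparison.OrdinalIgnoreCase);

        // Wrap the raw stream only when the body is actually compressed.
        Stream source = isGzip
            ? new GZipStream(body, CompressionMode.Decompress)
            : body;

        using (var reader = new StreamReader(source, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the variables from the code above, a call might look like Result = BodyDecoder.ReadBody(response.GetResponseStream(), response.Headers["Content-Encoding"], Encoding.GetEncoding(coding)).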