C#: How to Access Website Resources via Socket

  • Introduction

In C#, we usually use HttpWebRequest to access a URL resource, for example:

// Requires: using System; using System.IO; using System.IO.Compression; using System.Net;
public static string GetContentFromUrl(string url)
        {
            string content = null;
            try
            {
                WebRequest request = WebRequest.Create(url);
                // Tell the server we accept gzip- and deflate-compressed responses.
                request.Headers.Set("Accept-Encoding", "gzip, deflate");
                request.Timeout = 100000; // milliseconds
                using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                {
                    Stream responseStream = response.GetResponseStream();
                    StreamReader reader;
                    switch (response.ContentEncoding)
                    {
                        case "gzip":
                            // Wrap the response stream so it is decompressed while reading.
                            reader = new StreamReader(new GZipStream(responseStream, CompressionMode.Decompress));
                            break;
                        case "deflate":
                            reader = new StreamReader(new DeflateStream(responseStream, CompressionMode.Decompress));
                            break;
                        default:
                            reader = new StreamReader(responseStream);
                            break;
                    }
                    using (reader)
                    {
                        content = reader.ReadToEnd();
                    }
                }
            }
            catch (WebException ex)
            {
                // GetResponse throws for status codes >= 400; only the message is logged here.
                Console.WriteLine(ex.Message);
            }
            // Return the content that was read (null if the request failed).
            return content;
        }

When the server responds with an HTTP status code of 400 or above, GetResponse throws a WebException whose message includes the status code, for example:

The remote server returned an error: (401) Unauthorized.

Sometimes, when a 401 occurs, the server puts the specific error details in the response body, for example:

<XOIException><StatusCode>40100</StatusCode><StatusMessage>Unknown login error</StatusMessage></XOIException>
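
For illustration, the fields of such an error document can be read with XmlDocument. This is a minimal sketch, assuming the body has exactly the shape shown above; the class name `XoiErrorSketch` and the method `ReadElementText` are hypothetical helpers, not part of the original code.

```csharp
using System;
using System.Xml;

public static class XoiErrorSketch
{
    // Returns the text of the element at the given XPath, or null if it is missing.
    public static string ReadElementText(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(xpath);
        // InnerText returns the element's text content; XmlNode.Value is null for element nodes.
        return node == null ? null : node.InnerText;
    }
}
```

Calling `ReadElementText(xml, "/XOIException/StatusCode")` on the sample above would yield the string 40100.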

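Because HttpWebRequest treats any status code >= 400 as a WebException, the code above never reaches the point where it reads the response stream, so the error details in the body are lost.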

  • How can we get the specific error information and handle it accordingly?

We can access the URL resource over a Socket instead. The classes below implement this.
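
Before the full classes, it may help to see the kind of raw text that travels over the socket. This is a minimal sketch, assuming a plain HTTP/1.1 GET with no body; the class `RawHttpSketch` and its method are hypothetical and only illustrate the wire format.

```csharp
using System;
using System.Text;

public static class RawHttpSketch
{
    // Builds the text of a minimal HTTP/1.1 GET request.
    // HTTP requires CRLF ("\r\n") line endings and a blank line after the headers.
    public static string BuildGetRequest(string host, string path)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("GET " + path + " HTTP/1.1\r\n");
        sb.Append("Host: " + host + "\r\n");
        // Ask the server to close the connection after the response,
        // so the client can read until the socket reports end of stream.
        sb.Append("Connection: close\r\n");
        sb.Append("\r\n");
        return sb.ToString();
    }
}
```

The HttpRequest class in this article produces the same kind of text, with more headers and an optional body.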

RequestHeader

namespace Com.Morningstar.EquityData.XOIAccessor.Http
{
    public static class RequestHeader
    {
        public const string Host = "Host";
        public const string AcceptEncoding = "Accept-Encoding";
        public const string AcceptLanguage = "Accept-Language";
        public const string Accept = "Accept";
        public const string Connection = "Connection";
        public const string Cookie = "Cookie";
        public const string UserAgent = "User-Agent";
        public const string ContentType = "Content-Type";
        public const string ContentLength = "Content-Length";
        
    }
    public static class ResponseHeader
    {
        public static string ContentLength = "Content-Length";
        public static string ContentType = "Content-Type";
        public static string ContentEncoding = "Content-Encoding";
        public static string SetCookie = "Set-Cookie";
        
    }
    public static class Connection
    {
        public static string KeepAlive = "Keep-Alive";
        public static string Close = "Close";
    }
}

HttpMethod

namespace Com.Morningstar.EquityData.XOIAccessor.Http
{
    public enum HttpMethod
    {
        GET,POST
    }
}

HttpException

namespace Com.Morningstar.EquityData.XOIAccessor.Http
{
    public class HttpException:Exception
    {
        public HttpException(string message) : base(message) { }
        public HttpException(string message, Exception innerException) : base(message, innerException) { }
    }
}

HttpRequest: builds the HTTP request and returns an HttpResponse

namespace Com.Morningstar.EquityData.XOIAccessor.Http
{
    public class HttpRequest
    {
        internal HttpRequest()
        {
        }
        public string Host { set; get; }
        private int port = 80;
        public int Port
        {
            set
            {
                if (value > 0) // validate the incoming value, not the current field
                {
                    port = value;
                }
            }
            get
            {
                return port;
            }
        }
        private HttpMethod method = HttpMethod.GET;
        public HttpMethod Method
        {
            set
            {
                method = value;
            }
            get
            {
                return method;
            }
        }
        private string path = "/";
        public string Path
        {
            set
            {
                path = value;
            }
            get
            {
                return path;
            }
        }
        private NameValueCollection headers = new NameValueCollection();
        public NameValueCollection Headers
        {
            set
            {
                headers = value;
            }
            get
            {
                return headers;
            }
        }
        public void AddHeader(string name, string value)
        {
            headers[name] = value;
        }
        public string Body { set; get; }
        /// <summary>
        /// Milliseconds to wait for a response
        /// </summary>
        private int timeout = -1;//Never time out
        public int Timeout
        {
            set
            {
                if (value < -1) // validate the incoming value, not the current field
                {
                    throw new ArgumentOutOfRangeException("value", "Timeout is less than -1");
                }
                timeout = value;
            }
            get
            {
                return timeout;
            }
        }
        private void CheckReqiredParameters()
        {
            if (string.IsNullOrEmpty(Host))
            {
                throw new ArgumentException("Host is blank");
            }
        }
        public string BuilSocketRequest()
        {
            StringBuilder requestBuilder = new StringBuilder();
            FillHeader();
            BuildRequestLine(requestBuilder);
            BuildRequestHeader(requestBuilder);
            BuildRequestBody(requestBuilder);
            return requestBuilder.ToString();
        }
        private void FillHeader()
        {
            if (Method.Equals(HttpMethod.POST))
            {
                if (string.IsNullOrEmpty(Headers[RequestHeader.ContentType]))
                {
                    Headers[RequestHeader.ContentType] = "application/x-www-form-urlencoded";
                }
                if (!string.IsNullOrEmpty(Body) && string.IsNullOrEmpty(Headers[RequestHeader.ContentLength]))
                {
                    // Use the same encoding as GetResponse (ASCII) so the length matches the bytes actually sent.
                    Headers[RequestHeader.ContentLength] = Encoding.ASCII.GetBytes(Body).Length.ToString();
                }
            }
            // Always ask the server to close the connection: ParseResponseBody reads until
            // the socket closes when the response has no Content-Length header.
            Headers[RequestHeader.Connection] = Connection.Close;
            if (string.IsNullOrEmpty(Headers[RequestHeader.Accept]))
            {
                Headers[RequestHeader.Accept] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            }
            if (string.IsNullOrEmpty(Headers[RequestHeader.UserAgent]))
            {
                Headers[RequestHeader.UserAgent] = "Mozilla/5.0 (Windows NT 6.1; IE 9.0)";
            }
            if (string.IsNullOrEmpty(Headers[RequestHeader.AcceptEncoding]))
            {
                Headers[RequestHeader.AcceptEncoding] = "gzip, deflate";
            }
            if (string.IsNullOrEmpty(Headers[RequestHeader.Host]))
            {
                Headers[RequestHeader.Host] = Host;
            }
        }
        private void BuildRequestLine(StringBuilder requestBuilder)
        {
            // HTTP requires CRLF line endings; AppendLine would use Environment.NewLine,
            // which is "\n" on non-Windows platforms.
            requestBuilder.Append(string.Format("{0} {1} HTTP/1.1\r\n", Method, Path));
        }
        private void BuildRequestHeader(StringBuilder requestBuilder)
        {
            foreach (string name in Headers)
            {
                requestBuilder.Append(string.Format("{0}: {1}\r\n", name, Headers[name]));
            }
        }
        private void BuildRequestBody(StringBuilder requestBuilder)
        {
            // A blank line separates the headers from the body.
            requestBuilder.Append("\r\n");
            if (!string.IsNullOrEmpty(Body))
            {
                requestBuilder.Append(Body);
            }
        }
        public HttpResponse GetResponse()
        {
            CheckReqiredParameters();
            HttpResponse httpResponse = new HttpResponse();
            string socketRequest = BuilSocketRequest();
            byte[] requestBytes = Encoding.ASCII.GetBytes(socketRequest);
            try
            {
                using (Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
                {
                    socket.ReceiveTimeout = Timeout;
                    socket.Connect(Host, Port);
                    
                    if (socket.Connected)
                    {
                        socket.Send(requestBytes);
                        ParseResponseLine(socket, httpResponse);
                        ParseResponseHeader(socket, httpResponse);
                        ParseResponseBody(socket, httpResponse);
                        socket.Close();
                    }
                }
            }
            catch (Exception e)
            {
                throw new HttpException("Get response failure. Host:" + Host + ", Port:" + Port + ",RequestString:" + socketRequest, e);
            }
            return httpResponse;
        }
        private void ParseResponseLine(Socket socket, HttpResponse response)
        {
            string responseLine = ReceiveCharBytes(socket, "\r\n");
            responseLine = responseLine.Replace("\r\n", "");
            string[] fields = responseLine.Split(' ');
            if (fields.Length >= 3)
            {
                response.StatusCode = fields[1];
                response.StatusDescription = responseLine.Substring(responseLine.IndexOf(fields[1]) + fields[1].Length + 1);
            }
            else
            {
                throw new HttpException("The response line:'" + responseLine + "' has the wrong format.");
            }
        }
        private void ParseResponseHeader(Socket socket, HttpResponse response)
        {
            string responseHeader = ReceiveCharBytes(socket, "\r\n\r\n");
            string[] headerArry = Regex.Split(responseHeader, "\r\n");
            if (headerArry != null)
            {
                foreach (string header in headerArry)
                {
                    if (!string.IsNullOrEmpty(header))
                    {
                        int start = header.IndexOf(":");
                        if (start > 0)
                        {
                            string name = header.Substring(0, start);
                            // Skip the colon and trim optional whitespace around the value.
                            string value = header.Substring(start + 1).Trim();
                            response.AddHeader(name, value);
                        }
                    }
                }
            }
        }
        private string ReceiveCharBytes(Socket socket, string breakFlag)
        {
            StringBuilder builder = new StringBuilder();
            while (true)
            {
                byte[] buff = new byte[1];
                int read = socket.Receive(buff, SocketFlags.None);
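                if (read == 0)
                {
                    // The peer closed the connection before the terminator arrived;
                    // without this check the loop would never end.
                    throw new HttpException("Connection closed before the end of the response line or headers.");
                }
                builder.Append((char)buff[0]);
                if (builder.ToString().EndsWith(breakFlag, StringComparison.Ordinal))
                {
                    break;
                }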
            }
            return builder.ToString();
        }
        private void ParseResponseBody(Socket socket, HttpResponse response)
        {
            string contentLen = response.GetHeader(ResponseHeader.ContentLength);
            bool bodyDone = false;
            if (!string.IsNullOrEmpty(contentLen))
            {
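                int len = Convert.ToInt32(contentLen);
                byte[] contentBytes = new byte[len];
                int received = 0;
                // Receive may return fewer bytes than requested, so loop until the body is complete.
                while (received < len)
                {
                    int read = socket.Receive(contentBytes, received, len - received, SocketFlags.None);
                    if (read == 0)
                    {
                        throw new HttpException("Connection closed before the full body was received.");
                    }
                    received += read;
                }
                response.Body = contentBytes;
                bodyDone = true;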
            }
            if (!bodyDone)
            {
                List<byte[]> readsList = new List<byte[]>();
                int totalLength = 0;
                while (true)
                {
                    byte[] buff = new byte[1024];
                    int readLen = socket.Receive(buff);
                    if (readLen > 0)
                    {
                        totalLength += readLen;
                        byte[] reads = new byte[readLen];
                        Array.Copy(buff, 0, reads, 0, readLen);
                        readsList.Add(reads);
                    }
                    else
                    {
                        break;
                    }
                }
                byte[] fullBytes = new byte[totalLength];
                int index = 0;
                foreach (byte[] reads in readsList)
                {
                    Array.Copy(reads, 0, fullBytes, index, reads.Length);
                    index += reads.Length;
                }
                response.Body = fullBytes;
            }
        }
        private string GetResponseHeader(Socket socket)
        {
            StringBuilder builder = new StringBuilder();
            while (true)
            {
                byte[] buff = new byte[1];
                int read = socket.Receive(buff, SocketFlags.None);
                if (read == 0)
                {
                    break; // connection closed; return what was received so far
                }
                builder.Append((char)buff[0]);
                if (builder.ToString().Contains("\r\n\r\n"))
                {
                    break;
                }
            }
            return builder.ToString();
        }
        public static HttpRequest Create(string url)
        {
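            Uri uri = new Uri(url);
            if (uri.Scheme != Uri.UriSchemeHttp)
            {
                // The socket is used without TLS, so only plain http URLs can work.
                throw new NotSupportedException("Only http URLs are supported: " + url);
            }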
            HttpRequest request = new HttpRequest();
            request.Host = uri.Host;
            request.Port = uri.Port;
            request.Path = uri.PathAndQuery;
            return request;
        }
    }
}
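
One case the class above does not cover: an HTTP/1.1 server may send the body with `Transfer-Encoding: chunked` instead of a Content-Length header, and then the read-until-close loop in ParseResponseBody stores the chunk framing together with the data. The following is a minimal sketch of how such a body could be decoded once it is fully in memory; `ChunkedBodySketch` is a hypothetical helper, not part of the original code, and it ignores trailer headers.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkedBodySketch
{
    // Decodes a complete chunked body held in memory.
    // Each chunk is: <size in hex>[;extensions]\r\n<size bytes>\r\n, ending with a 0-size chunk.
    public static byte[] Decode(byte[] raw)
    {
        MemoryStream output = new MemoryStream();
        int pos = 0;
        while (true)
        {
            // Find the CRLF that ends the chunk-size line.
            int lineEnd = pos;
            while (lineEnd + 1 < raw.Length && !(raw[lineEnd] == '\r' && raw[lineEnd + 1] == '\n'))
            {
                lineEnd++;
            }
            if (lineEnd + 1 >= raw.Length)
            {
                throw new FormatException("Chunk size line is not terminated by CRLF.");
            }
            string sizeLine = Encoding.ASCII.GetString(raw, pos, lineEnd - pos);
            int semicolon = sizeLine.IndexOf(';');
            if (semicolon >= 0)
            {
                sizeLine = sizeLine.Substring(0, semicolon); // drop chunk extensions
            }
            int size = Convert.ToInt32(sizeLine.Trim(), 16);
            pos = lineEnd + 2;
            if (size == 0)
            {
                break; // last chunk; trailer headers are ignored in this sketch
            }
            if (pos + size > raw.Length)
            {
                throw new FormatException("Chunk data is shorter than its declared size.");
            }
            output.Write(raw, pos, size);
            pos += size + 2; // skip the chunk data and its trailing CRLF
        }
        return output.ToArray();
    }
}
```

A caller could check the response's Transfer-Encoding header and, when it is "chunked", pass the body through `Decode` before any gzip or deflate decompression.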

HttpResponse: wraps the HTTP response (status line, headers, and body)

namespace Com.Morningstar.EquityData.XOIAccessor.Http
{
    public class HttpResponse
    {
        internal HttpResponse()
        {
        }
        #region Response Line
        public string StatusCode { internal set; get; }
        public string StatusDescription{ internal set;get; }
        #endregion
        #region Response Headers
        private NameValueCollection headers = new NameValueCollection();
        public NameValueCollection Headers { get { return headers; } }
        internal void AddHeader(string name, string value)
        {
            headers[name] = value;
        }
        public string GetHeader(string name)
        {
            return headers[name];
        }
        public long? ContentLength
        {
            get
            {
                if(!string.IsNullOrEmpty(GetHeader(ResponseHeader.ContentLength)))
                {
                    return Convert.ToInt64(GetHeader(ResponseHeader.ContentLength));
                }
                return null;  
            }
        }
        public string ContentEncoding
        {
            get
            {
                return GetHeader(ResponseHeader.ContentEncoding);
            }
        }
        #endregion
        public byte[] Body { internal set; get; }
        public Stream GetBodyStream()
        {
            if (Body != null)
            {
                return new MemoryStream(Body);
            }
            return null;
        }
    }
}

  • How to use the HttpRequest class

public string GetContent(string url)
        {
            Login();
            
            HttpRequest request = HttpRequest.Create(url);
            request.Method = HttpMethod.GET;
            request.AddHeader(RequestHeader.AcceptEncoding, "gzip, deflate");
            request.AddHeader(RequestHeader.Cookie, AuthCookie);
            request.Timeout = Timeout;
            
            HttpResponse resp = request.GetResponse();
            string xoiErrorCode = resp.GetHeader("X-XOI-ErrorCode");
            if (!string.IsNullOrEmpty(xoiErrorCode))
            {
                if (!xoiErrorCode.Equals(XOIErrorCode.XOI_EC_40401))
                {
                    XOIException xoiException = new XOIException("Get content fail. Url:" + url);
                    string errorContent = ReadContent(resp);
                    XmlDocument doc = new XmlDocument();
                    doc.LoadXml(errorContent);
                    XmlNode statusCodeNode = doc.SelectSingleNode(@"/XOIException/StatusCode");
                    XmlNode statusMessageNode = doc.SelectSingleNode(@"/XOIException/StatusMessage");
                    if (statusCodeNode != null)
                    {
                        // InnerText holds the element's text; XmlNode.Value is null for element nodes.
                        xoiException.XOIErrorCode = statusCodeNode.InnerText;
                    }
                    if (statusMessageNode != null)
                    {
                        xoiException.XOIErrorInfo = statusMessageNode.InnerText;
                    }
                    throw xoiException;
                }
                else
                {
                    return string.Empty;
                }
            }
            return ReadContent(resp);
        }
        protected string ReadContent(HttpResponse resp)
        {
            StreamReader reader = null;
            try
            {
                switch (resp.ContentEncoding)
                {
                    case "gzip":
                        reader = new StreamReader(new GZipStream(resp.GetBodyStream(), CompressionMode.Decompress));
                        break;
                    case "deflate":
                        reader = new StreamReader(new DeflateStream(resp.GetBodyStream(), CompressionMode.Decompress));
                        break;
                    default:
                        reader = new StreamReader(resp.GetBodyStream());
                        break;
                }
                return reader.ReadToEnd();
            }
            finally
            {
                if (reader != null)
                {
                    reader.Close();
                }
            }
        }

Date: 2024-12-23 05:29:39
