Source Code Analysis of the requests Library's Core API

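requests is one of the most frequently used libraries for Python web scraping and plays an important role in making network requests. This article gives a brief analysis of the source code behind the requests API.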

The file structure of the library is shown in the figure:

The core interface is exposed in the package's __init__.py file, as follows:

from . import utils
from . import packages
from .models import Request, Response, PreparedRequest
from .api import request, get, head, post, patch, put, delete, options
from .sessions import session, Session
from .status_codes import codes
from .exceptions import (
    RequestException, Timeout, URLRequired,
    TooManyRedirects, HTTPError, ConnectionError,
    FileModeWarning, ConnectTimeout, ReadTimeout
)

The commonly used requests methods live in api.py. Its source is as follows:

# -*- coding: utf-8 -*-

"""

requests.api

~~~~~~~~~~~~

This module implements the Requests API.

:copyright: (c) 2012 by Kenneth Reitz.

:license: Apache2, see LICENSE for more details.

"""

from . import sessions

def request(method, url, **kwargs):

    """Constructs and sends a :class:`Request <Request>`.

    :param method: method for the new :class:`Request` object.

    :param url: URL for the new :class:`Request` object.

    :param params: (optional) Dictionary, list of tuples or bytes to send

        in the body of the :class:`Request`.

    :param data: (optional) Dictionary, list of tuples, bytes, or file-like

        object to send in the body of the :class:`Request`.

    :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.

    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.

    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.

    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.

        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``

        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string

        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers

        to add for the file.

    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.

    :param timeout: (optional) How many seconds to wait for the server to send data

        before giving up, as a float, or a :ref:`(connect timeout, read

        timeout) <timeouts>` tuple.

    :type timeout: float or tuple

    :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.

    :type allow_redirects: bool

    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.

    :param verify: (optional) Either a boolean, in which case it controls whether we verify

            the server's TLS certificate, or a string, in which case it must be a path

            to a CA bundle to use. Defaults to ``True``.

    :param stream: (optional) if ``False``, the response content will be immediately downloaded.

    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    Usage::

      >>> import requests

      >>> req = requests.request('GET', 'https://httpbin.org/get')

      <Response [200]>

    """

    # By using the 'with' statement we are sure the session is closed, thus we

    # avoid leaving sockets open which can trigger a ResourceWarning in some

    # cases, and look like a memory leak in others.

    with sessions.Session() as session:

        return session.request(method=method, url=url, **kwargs)

def get(url, params=None, **kwargs):

    r"""Sends a GET request.

    :param url: URL for the new :class:`Request` object.

    :param params: (optional) Dictionary, list of tuples or bytes to send

        in the body of the :class:`Request`.

    :param \*\*kwargs: Optional arguments that ``request`` takes.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    """

    kwargs.setdefault('allow_redirects', True)

    return request('get', url, params=params, **kwargs)

def options(url, **kwargs):

    r"""Sends an OPTIONS request.

    :param url: URL for the new :class:`Request` object.

    :param \*\*kwargs: Optional arguments that ``request`` takes.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    """

    kwargs.setdefault('allow_redirects', True)

    return request('options', url, **kwargs)

def head(url, **kwargs):

    r"""Sends a HEAD request.

    :param url: URL for the new :class:`Request` object.

    :param \*\*kwargs: Optional arguments that ``request`` takes.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    """

    kwargs.setdefault('allow_redirects', False)

    return request('head', url, **kwargs)

def post(url, data=None, json=None, **kwargs):

    r"""Sends a POST request.

    :param url: URL for the new :class:`Request` object.

    :param data: (optional) Dictionary, list of tuples, bytes, or file-like

        object to send in the body of the :class:`Request`.

    :param json: (optional) json data to send in the body of the :class:`Request`.

    :param \*\*kwargs: Optional arguments that ``request`` takes.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    """

    return request('post', url, data=data, json=json, **kwargs)

def put(url, data=None, **kwargs):

    r"""Sends a PUT request.

    :param url: URL for the new :class:`Request` object.

    :param data: (optional) Dictionary, list of tuples, bytes, or file-like

        object to send in the body of the :class:`Request`.

    :param json: (optional) json data to send in the body of the :class:`Request`.

    :param \*\*kwargs: Optional arguments that ``request`` takes.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    """

    return request('put', url, data=data, **kwargs)

def patch(url, data=None, **kwargs):

    r"""Sends a PATCH request.

    :param url: URL for the new :class:`Request` object.

    :param data: (optional) Dictionary, list of tuples, bytes, or file-like

        object to send in the body of the :class:`Request`.

    :param json: (optional) json data to send in the body of the :class:`Request`.

    :param \*\*kwargs: Optional arguments that ``request`` takes.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    """

    return request('patch', url, data=data, **kwargs)

def delete(url, **kwargs):

    r"""Sends a DELETE request.

    :param url: URL for the new :class:`Request` object.

    :param \*\*kwargs: Optional arguments that ``request`` takes.

    :return: :class:`Response <Response>` object

    :rtype: requests.Response

    """

    return request('delete', url, **kwargs)

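The commonly used get, post, put, options, and delete methods are all implemented in this file. Each of them delegates to a single function in the same module, request, which in turn wraps Session.request and runs it inside a context manager, so the session is closed as soon as the call returns.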
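
The delegation chain described above can be modeled in a few lines. This is a minimal sketch rather than the library's code: `ToySession` and its `closed` flag are hypothetical stand-ins for `sessions.Session`, and nothing is sent over the network.

```python
class ToySession:
    """Hypothetical stand-in for sessions.Session; it sends nothing."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the with-block closes the session, as in api.request.
        self.closed = True
        return False

    def request(self, method, url, **kwargs):
        # Echo the arguments instead of performing an HTTP call.
        return {"method": method.upper(), "url": url,
                "kwargs": kwargs, "session": self}


def request(method, url, **kwargs):
    # One short-lived session per call, closed when the block exits.
    with ToySession() as session:
        return session.request(method=method, url=url, **kwargs)


def get(url, params=None, **kwargs):
    # Verb helpers only fill in defaults and delegate to request().
    kwargs.setdefault("allow_redirects", True)
    return request("get", url, params=params, **kwargs)


def head(url, **kwargs):
    # HEAD is the one helper that defaults to not following redirects.
    kwargs.setdefault("allow_redirects", False)
    return request("head", url, **kwargs)


result = get("https://httpbin.org/get", params={"q": "1"})
```

Because every module-level call opens and closes its own session, nothing such as cookies is carried over from one call to the next. Code that needs that continuity would create a Session and call its methods directly.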

Next, the source of Session.request, the real core:

    def request(self, method, url,

       ·······

        # Create the Request.

        req = Request(

            method=method.upper(),

            url=url,

            headers=headers,

            files=files,

            data=data or {},

            json=json,

            params=params or {},

            auth=auth,

            cookies=cookies,

            hooks=hooks,

        )

        prep = self.prepare_request(req)

        proxies = proxies or {}

        settings = self.merge_environment_settings(

            prep.url, proxies, stream, verify, cert

        )

        # Send the request.

        send_kwargs = {

            'timeout': timeout,

            'allow_redirects': allow_redirects,

        }

        send_kwargs.update(settings)

        resp = self.send(prep, **send_kwargs)

        return resp

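Here the incoming arguments are assembled into a Request object, which prepare_request turns into a PreparedRequest. The configuration parameters (proxies, stream, verify, cert) are merged, and the prepared request together with those settings is handed to self.send, which performs the actual request. The next step is self.send.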
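
Before reading on, the assembly steps just described can be sketched in isolation. This is a simplified model under stated assumptions: `ToyRequest`, `ToyPreparedRequest`, `prepare`, and `build_send_kwargs` are hypothetical names, and the header merge is only one small part of what the real prepare_request does.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class ToyRequest:
    """Hypothetical stand-in for models.Request: raw user input."""
    method: str
    url: str
    params: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)


@dataclass
class ToyPreparedRequest:
    """Hypothetical stand-in for models.PreparedRequest: ready to send."""
    method: str
    url: str
    headers: dict


def prepare(req, session_headers):
    # Fold the query parameters into the URL.
    url = req.url
    if req.params:
        separator = "&" if "?" in url else "?"
        url = url + separator + urlencode(req.params)
    # Assumed simplification: per-request headers override session defaults.
    headers = {**session_headers, **req.headers}
    return ToyPreparedRequest(req.method, url, headers)


def build_send_kwargs(timeout, allow_redirects, settings):
    # Same shape as the quoted source: two fixed keys, then the merged settings.
    send_kwargs = {"timeout": timeout, "allow_redirects": allow_redirects}
    send_kwargs.update(settings)
    return send_kwargs


# Mirrors method.upper() and "params or {}" from the quoted source.
req = ToyRequest(method="get".upper(), url="https://httpbin.org/get",
                 params={"q": "1"} or {}, headers={"Accept": "text/html"})
prep = prepare(req, session_headers={"Accept": "*/*", "User-Agent": "toy"})
send_kwargs = build_send_kwargs(
    timeout=5, allow_redirects=True,
    settings={"proxies": {}, "stream": False, "verify": True, "cert": None},
)
```

The point of the two-stage design is that everything user-facing is normalized once, so the sending code only ever sees a fully prepared request plus a flat dictionary of options. The library's own self.send follows.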

    def send(self, request, **kwargs):

        """Send a given PreparedRequest.

        :rtype: requests.Response

        """

        # Set defaults that the hooks can utilize to ensure they always have

        # the correct parameters to reproduce the previous request.

        kwargs.setdefault('stream', self.stream)

        kwargs.setdefault('verify', self.verify)

        kwargs.setdefault('cert', self.cert)

        kwargs.setdefault('proxies', self.proxies)

        # It's possible that users might accidentally send a Request object.

        # Guard against that specific failure case.

        if isinstance(request, Request):

            raise ValueError('You can only send PreparedRequests.')

        # Set up variables needed for resolve_redirects and dispatching of hooks

        allow_redirects = kwargs.pop('allow_redirects', True)

        stream = kwargs.get('stream')

        hooks = request.hooks

        # Get the appropriate adapter to use

        adapter = self.get_adapter(url=request.url)

        # Start time (approximately) of the request

        start = preferred_clock()

        # Send the request

        r = adapter.send(request, **kwargs)

        # Total elapsed time of the request (approximately)

        elapsed = preferred_clock() - start

        r.elapsed = timedelta(seconds=elapsed)

        # Response manipulation hooks

        r = dispatch_hook('response', hooks, r, **kwargs)

        # Persist cookies

        if r.history:

            # If the hooks create history then we want those cookies too

            for resp in r.history:

                extract_cookies_to_jar(self.cookies, resp.request, resp.raw)

        extract_cookies_to_jar(self.cookies, request, r.raw)

        # Redirect resolving generator.

        gen = self.resolve_redirects(r, request, **kwargs)

        # Resolve redirects if allowed.

        history = [resp for resp in gen] if allow_redirects else []

        # Shuffle things around if there's history.

        if history:

            # Insert the first (original) request at the start

            history.insert(0, r)

            # Get the last request made

            r = history.pop()

            r.history = history

        # If redirects aren't being followed, store the response on the Request for Response.next().

        if not allow_redirects:

            try:

                r._next = next(self.resolve_redirects(r, request, yield_requests=True, **kwargs))

            except StopIteration:

                pass

        if not stream:

            r.content

        return r

Of course, the core of self.send is the following few lines:

        # Start time (approximately) of the request

        start = preferred_clock()

        # Send the request

        r = adapter.send(request, **kwargs)

        # Total elapsed time of the request (approximately)

        elapsed = preferred_clock() - start

        r.elapsed = timedelta(seconds=elapsed)

        # Response manipulation hooks

        r = dispatch_hook('response', hooks, r, **kwargs)


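These lines send the request and build the response object r from what comes back. They rely on another of the library's own modules, the adapter (adapters.py), which is mainly responsible for sending the request and processing the response content.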
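
The timing and hook steps in those core lines can be illustrated with a small offline model. This is a sketch under assumptions: `ToyAdapter`, `ToyResponse`, and `mark_seen` are hypothetical, `time.perf_counter` stands in for `preferred_clock`, and `dispatch_hook` here is a simplified re-implementation rather than the library's function.

```python
import time
from datetime import timedelta


class ToyResponse:
    """Hypothetical stand-in for models.Response."""

    def __init__(self, status_code):
        self.status_code = status_code
        self.elapsed = None
        self.seen_by_hook = False


class ToyAdapter:
    """Hypothetical stand-in for an adapter; returns a canned response."""

    def send(self, request, **kwargs):
        return ToyResponse(200)


def dispatch_hook(key, hooks, hook_data, **kwargs):
    # Simplified: run every hook registered under key; a hook may return
    # a replacement object, otherwise the original is kept.
    for hook in hooks.get(key, []):
        result = hook(hook_data, **kwargs)
        if result is not None:
            hook_data = result
    return hook_data


def mark_seen(response, **kwargs):
    # Example hook: mutate the response in place and return nothing.
    response.seen_by_hook = True


def send(request, adapter, hooks, **kwargs):
    start = time.perf_counter()               # start time of the request
    r = adapter.send(request, **kwargs)       # the adapter does the I/O
    elapsed = time.perf_counter() - start     # approximate total time
    r.elapsed = timedelta(seconds=elapsed)
    return dispatch_hook("response", hooks, r, **kwargs)


r = send("prepared-request-placeholder", ToyAdapter(),
         {"response": [mark_seen]}, timeout=5)
```

Keeping the I/O behind `adapter.send` is what lets the surrounding code stay the same regardless of how the bytes actually travel. The session only measures time, runs hooks, and post-processes the result.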

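The requests library is cleverly implemented and feature-complete: it handles cookie persistence, proxies, and SSL verification. Many of its details are hard to understand without reading the code closely, and this article only gives a brief walk through the request flow. Interested readers can study each module and feature in depth; there is a lot of subtlety to discover.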



Original article: https://www.cnblogs.com/pypypy/p/12003908.html

Time: 2024-11-10 07:24:21
