Scrapy Source Code: The Response Object


"""
This module implements the Response class which is used to represent HTTP
responses in Scrapy.

See documentation in docs/topics/request-response.rst
"""
from six.moves.urllib.parse import urljoin

from scrapy.http.request import Request
from scrapy.http.headers import Headers
from scrapy.link import Link
from scrapy.utils.trackref import object_ref
from scrapy.http.common import obsolete_setter
from scrapy.exceptions import NotSupported

class Response(object_ref):

    def __init__(self, url, status=200, headers=None, body=b'', flags=None, request=None):
        self.headers = Headers(headers or {})
        self.status = int(status)
        self._set_body(body)
        self._set_url(url)
        self.request = request
        self.flags = [] if flags is None else list(flags)

    @property
    def meta(self):
        try:
            return self.request.meta
        except AttributeError:
            raise AttributeError(
                "Response.meta not available, this response "
                "is not tied to any request"
            )

    def _get_url(self):
        return self._url

    def _set_url(self, url):
        if isinstance(url, str):
            self._url = url
        else:
            raise TypeError('%s url must be str, got %s:' % (type(self).__name__,
                type(url).__name__))

    url = property(_get_url, obsolete_setter(_set_url, 'url'))

    def _get_body(self):
        return self._body

    def _set_body(self, body):
        if body is None:
            self._body = b''
        elif not isinstance(body, bytes):
            raise TypeError(
                "Response body must be bytes. "
                "If you want to pass unicode body use TextResponse "
                "or HtmlResponse.")
        else:
            self._body = body

    body = property(_get_body, obsolete_setter(_set_body, 'body'))

    def __str__(self):
        return "<%d %s>" % (self.status, self.url)

    __repr__ = __str__

    def copy(self):
        """Return a copy of this Response"""
        return self.replace()

    def replace(self, *args, **kwargs):
        """Create a new Response with the same attributes except for those
        given new values.
        """
        for x in ['url', 'status', 'headers', 'body', 'request', 'flags']:
            kwargs.setdefault(x, getattr(self, x))
        cls = kwargs.pop('cls', self.__class__)
        return cls(*args, **kwargs)

    def urljoin(self, url):
        """Join this Response's url with a possible relative url to form an
        absolute interpretation of the latter."""
        return urljoin(self.url, url)

    @property
    def text(self):
        """For subclasses of TextResponse, this will return the body
        as text (unicode object in Python 2 and str in Python 3)
        """
        raise AttributeError("Response content isn't text")

    def css(self, *a, **kw):
        """Shortcut method implemented only by responses whose content
        is text (subclasses of TextResponse).
        """
        raise NotSupported("Response content isn't text")

    def xpath(self, *a, **kw):
        """Shortcut method implemented only by responses whose content
        is text (subclasses of TextResponse).
        """
        raise NotSupported("Response content isn't text")

    def follow(self, url, callback=None, method='GET', headers=None, body=None,
               cookies=None, meta=None, encoding='utf-8', priority=0,
               dont_filter=False, errback=None, cb_kwargs=None):
        # type: (...) -> Request
        """
        Return a :class:`~.Request` instance to follow a link ``url``.
        It accepts the same arguments as ``Request.__init__`` method,
        but ``url`` can be a relative URL or a ``scrapy.link.Link`` object,
        not only an absolute URL.

        :class:`~.TextResponse` provides a :meth:`~.TextResponse.follow`
        method which supports selectors in addition to absolute/relative URLs
        and Link objects.
        """
        if isinstance(url, Link):
            url = url.url
        elif url is None:
            raise ValueError("url can't be None")
        url = self.urljoin(url)
        return Request(url, callback,
                       method=method,
                       headers=headers,
                       body=body,
                       cookies=cookies,
                       meta=meta,
                       encoding=encoding,
                       priority=priority,
                       dont_filter=dont_filter,
                       errback=errback,
                       cb_kwargs=cb_kwargs)
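
To get a feel for how replace(), copy() and urljoin() fit together, here is a minimal sketch that runs without Scrapy installed. MiniResponse is a hypothetical stand-in written for this note, not Scrapy's class: it keeps only url, status, body and flags, and it uses plain read-only properties where the listing uses obsolete_setter, on the assumption that the practical effect is the same (assigning to url or body fails, so changes go through replace()).

```python
from urllib.parse import urljoin


class MiniResponse:
    """Hypothetical stand-in for scrapy.http.Response (not Scrapy's code)."""

    def __init__(self, url, status=200, body=b"", flags=None):
        if not isinstance(url, str):
            raise TypeError("url must be str, got %s" % type(url).__name__)
        if body is None:
            body = b""
        elif not isinstance(body, bytes):
            raise TypeError("body must be bytes")
        self._url = url
        self._body = body
        self.status = int(status)
        self.flags = [] if flags is None else list(flags)

    # Read-only properties: there is no setter, so assignment raises
    # AttributeError and callers have to build a new object via replace().
    @property
    def url(self):
        return self._url

    @property
    def body(self):
        return self._body

    def replace(self, **kwargs):
        # Fill in every attribute the caller did not override.
        for name in ("url", "status", "body", "flags"):
            kwargs.setdefault(name, getattr(self, name))
        cls = kwargs.pop("cls", self.__class__)
        return cls(**kwargs)

    def copy(self):
        # A copy is simply a replace() with nothing overridden.
        return self.replace()

    def urljoin(self, url):
        # Resolve a possibly relative URL against this response's URL.
        return urljoin(self.url, url)


page = MiniResponse("https://example.com/list/page1.html", body=b"<html></html>")
not_found = page.replace(status=404)    # new object, only status differs
next_url = page.urljoin("page2.html")   # relative to the current directory
root_url = page.urljoin("/about")       # relative to the site root

print(not_found.status, not_found.url)
print(next_url)
print(root_url)
```

The original object is left untouched by replace(), which is why copy() can be written as a one-liner. follow() in the listing relies on the same urljoin() step before it builds the Request, so relative links resolve the way next_url and root_url do here.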

Original article: https://www.cnblogs.com/yinminbo/p/12160241.html

Date: 2024-10-08 20:08:27
