urllib.request

【urllib.request】

1、urlopen结果保存在内存。

2、ulrretrieve结果保存到文件。

3、response有read方法。

4、可以创建Request对象。

5、发送Post数据，需要encode()成ascii的byte.

6、url中加入query

7、加入User-Agent参数。

8、错误。

　　urlopen raises URLError when it cannot handle a response (though as usual with Python APIs, built-in exceptions such as ValueError, TypeError etc. may also be raised).

　　HTTPError is the subclass of URLError raised in the specific case of HTTP URLs.

　　The exception classes are exported from the urllib.error module.

9、http.server.BaseHTTPRequestHandler.responses is a useful dictionary of response codes in that shows all the response codes used by RFC 2616. The dictionary is reproduced here for convenience

　　包含所有的HTTP错误码。

时间： 2024-10-10 10:51:31

urllib.request的相关文章

通过python的urllib.request库来爬取一只猫

我们实验的网站很简单,就是一个关于猫的图片的网站:http://placekitten.com 代码如下: import urllib.request respond = urllib.request.urlopen("http://placekitten.com.s3.amazonaws.com/homepage-samples/200/287.jpg") cat_img = respond.read() f = open('cat_200_300.jpg','wb') f.writ

在python3中使用urllib.request编写简单的网络爬虫

Python官方提供了用于编写网络爬虫的包 urllib.request, 我们主要用它进行打开url,读取url里面的内容,下载里面的图片. 分以下几步: step1:用urllib.request.urlopen打开目标网站 step2:由于urllib.request.urlopen返回的是一个http.client.HTTPResponse object,无法直接读取里面的内容,所以直接调用该对象的方法read(),获取到页面代码,存到html里 step3:构建正则表达式,从页面代码里

python3中urllib.request.urlopen.read读取的网页格式问题

#!/usr/bin/env python3 #-*- coding: utf-8 -*- #<a title="" target="_blank" href="http://blog.sina.com.cn/s/blog_4701280b0102eo83.html"><论电影的七个元素>——关于我对电…</a> import urllib.request str0 =r' <a title="

2. Python标准库urllib.request模块_2(python3)

参考学习地址:http://www.iplaypython.com # coding:utf-8 # 学习1 import urllib.request # print(dir(html)) # 获取网页所在的header信息 url="http://www.iplaypython.com/" html=urllib.request.urlopen(url) # 获取网站返回的状态码 code = html.getcode() print("返回的状态码: %s"

Python 3.X 要使用urllib.request 来抓取网络资源。转

Python 3.X 要使用urllib.request 来抓取网络资源. 最简单的方式: #coding=utf-8 import urllib.request response = urllib.request.urlopen('http://python.org/') buff = response.read() #显示 html = buff.decode("utf8") response.close() print(html) 使用Request的方式: #coding=ut

爬虫初探(1)之urllib.request

-----------我是小白------------ urllib.request是python3自带的库(python3.x版本特有),我们用它来请求网页,并获取网页源码. # 导入使用库 import urllib.request url = "http://www.baidu.com" # urlopen用来打开一个网页 data = urllib.request.urlopen(url) # 这里的rend()是必须的,否则不能打印源码. data = data.read()

爬虫小探-Python3 urllib.request获取页面数据

使用Python3 urllib.request中的Requests()和urlopen()方法获取页面源码,并用re正则进行正则匹配查找需要的数据. #forex.py#coding:utf-8 ''' urllib.request.urlopen() function in Python 3 is equivalent to urllib2.urlopen() in Python2 urllib.request.Request() function in Python 3 is equiva

Urllib.request用法简单介绍(Python3.3)

Urllib.request用法简单介绍(Python3.3),有需要的朋友可以参考下. urllib是Python标准库的一部分,包含urllib.request,urllib.error,urllib.parse,urlli.robotparser四个子模块,这里主要介绍urllib.request的一些简单用法. 首先是urlopen函数,用于打开一个URL: # -*- coding:utf-8 -*- #获取并打印google首页的htmlimport urllib.requestre

关于python3.X 报"import urllib.request ImportError: No module named request"错误,解决办法

#encoding:UTF-8 import urllib.request url = "http://www.baidu.com" data = urllib.request.urlopen(url).read() data = data.decode('UTF-8') print(data) 报错:import urllib.request ImportError: No module named request 解决办法: #encoding:UTF-8 import urlli