Three ways to download a file in Python

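Downloading files comes up often in Python development. The most common approach is to fetch the file over HTTP with the urllib or urllib2 module. (The examples below are Python 2; in Python 3 the equivalent functionality lives in urllib.request.)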

You can also use ftplib to download files from an FTP site. Beyond the standard library, the third-party requests package offers yet another way.
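The ftplib route is not shown in the methods below, so here is a minimal sketch of what it might look like. It assumes Python 3 and an anonymous FTP login; the helper names `fetch_via_ftp` and `download_from_ftp` are made up for illustration, and any host name you pass in is your own.

```python
from ftplib import FTP


def fetch_via_ftp(ftp, remote_name, local_path, blocksize=8192):
    """Download remote_name over an already connected FTP session.

    `ftp` is expected to be a logged-in ftplib.FTP instance (or any
    object with a compatible retrbinary method).
    Returns the number of bytes written to local_path.
    """
    written = 0
    with open(local_path, "wb") as out:

        def handle_block(block):
            nonlocal written
            out.write(block)
            written += len(block)

        # RETR asks the server to send the file; retrbinary calls
        # handle_block once for every block it receives.
        ftp.retrbinary("RETR " + remote_name, handle_block, blocksize)
    return written


def download_from_ftp(host, remote_name, local_path):
    """Connect to host anonymously (an assumption) and fetch one file."""
    with FTP(host) as ftp:
        ftp.login()  # anonymous login; pass user and passwd if required
        return fetch_via_ftp(ftp, remote_name, local_path)
```

A call such as `download_from_ftp("ftp.example.test", "demo.zip", "demo_ftp.zip")` would then save the remote file locally, provided the server allows anonymous access.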

Here is how three of these approaches download a zip file:
Method 1:

import urllib

print "downloading with urllib"
url = 'http://***/test/demo.zip'
# urlretrieve fetches the URL and saves it straight to a local file
urllib.urlretrieve(url, "demo.zip")

Method 2:

import urllib2

print "downloading with urllib2"
url = 'http://***/test/demo.zip'
f = urllib2.urlopen(url)
# read the whole response body into memory, then write it out
data = f.read()
with open("demo2.zip", "wb") as code:
    code.write(data)

Method 3:

import requests

print "downloading with requests"
url = 'http://***/test/demo.zip'
r = requests.get(url)
# r.content holds the raw bytes of the response body
with open("demo3.zip", "wb") as code:
    code.write(r.content)

urllib looks the simplest: a single statement does the job. Of course, the urllib2 version can be shortened to:

f = urllib2.urlopen(url)
with open("demo2.zip", "wb") as code:
    code.write(f.read())
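Methods 2 and 3 read the entire response into memory before writing it to disk, which can be wasteful for large archives. One possible alternative is to copy the response to the file in chunks. The sketch below assumes Python 3, where `urlopen` lives in `urllib.request`; the function name `download_to_file` and the 64 KiB chunk size are illustrative choices, not something from the original article.

```python
import shutil
import urllib.request


def download_to_file(url, dest_path, chunk_size=64 * 1024):
    """Stream the resource at url into dest_path in fixed-size chunks."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # copyfileobj reads and writes chunk_size bytes at a time, so the
        # whole file never has to fit in memory at once.
        shutil.copyfileobj(response, out, chunk_size)
    return dest_path
```

Usage would be along the lines of `download_to_file(url, "demo_stream.zip")`, with `url` pointing at the archive to fetch.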

================================================

Python Requests

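Handling HTTP with Python's standard library is more trouble than it needs to be: the urllib2 module delivers comprehensive functionality at the cost of considerable complexity. Compared with urllib2, Kenneth Reitz's Requests module covers the full range of simple use cases far more concisely.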

A simple example:
Suppose we want to GET the resource at http://example.test/ and inspect the status code, the content-type header, and the response body. This is easy to do with either urllib2 or Requests.
urllib2

>>> import urllib2
>>> url = 'http://example.test/'
>>> response = urllib2.urlopen(url)
>>> response.getcode()
200
>>> response.headers.getheader('content-type')
'text/html; charset=utf-8'
>>> response.read()
'Hello, world!'

Requests

>>> import requests
>>> url = 'http://example.test/'
>>> response = requests.get(url)
>>> response.status_code
200
>>> response.headers['content-type']
'text/html; charset=utf-8'
>>> response.text
u'Hello, world!'

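The two approaches are very similar. Where urllib2 has you call methods to read information from the response, Requests exposes the same information as attributes.
There are also two subtle but important differences:

1 Requests automatically decodes the response body to Unicode (available as response.text).

2 Requests automatically stores the response body, so you can read it as many times as you like, whereas urllib2.urlopen() returns a file-like object that can be read only once.

The second point is what makes working with urllib2 in the interactive interpreter so annoying.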
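The read-once behaviour can be observed without any network access. The following sketch assumes Python 3 (`urllib.request` in place of urllib2) and uses a `file://` URL as a stand-in for an HTTP server so that it runs offline; both helper names are hypothetical.

```python
import pathlib
import tempfile
import urllib.request


def read_twice(url):
    """Return the results of two consecutive read() calls on one response."""
    with urllib.request.urlopen(url) as response:
        first = response.read()
        second = response.read()  # the stream is already exhausted here
    return first, second


def make_sample_url(text="Hello, world!"):
    """Write text to a temporary file and return a file:// URL for it."""
    path = pathlib.Path(tempfile.mkdtemp()) / "hello.txt"
    path.write_bytes(text.encode("utf-8"))
    return path.as_uri()
```

Calling `read_twice(make_sample_url())` should return the body from the first read and an empty bytes object from the second, so the usual workaround is to keep the first result in a variable. With Requests this bookkeeping is unnecessary, because the body is stored on the response object.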

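A slightly more complex example:
Now let's try something a little harder: use GET to fetch the resource at http://example.test/secret, which this time requires HTTP Basic authentication. Taking the code above as a template, it looks as though we only need to change the urllib2.urlopen() and requests.get() calls so that they send a username and password.

Here is the urllib2 way: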

>>> import urllib2
>>> url = 'http://example.test/secret'
>>> password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
>>> password_manager.add_password(None, url, 'dan', 'h0tdish')
>>> auth_handler = urllib2.HTTPBasicAuthHandler(password_manager)
>>> opener = urllib2.build_opener(auth_handler)
>>> urllib2.install_opener(opener)
>>> response = urllib2.urlopen(url)
>>> response.getcode()
200
>>> response.read()
'Welcome to the secret page!'

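For one simple request we instantiated two classes, built a third object from them, installed it globally into the urllib2 module, and only then called urlopen. And what exactly do those two complicated classes do?

Confused? All of the urllib2 documentation is here: http://docs.python.org/release/2.7/library/urllib2.html

So how does Requests solve the same problem?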

Requests

>>> import requests
>>> url = 'http://example.test/secret'
>>> response = requests.get(url, auth=('dan', 'h0tdish'))
>>> response.status_code
200
>>> response.text
u'Welcome to the secret page!'

The only change is an extra auth keyword argument on the call.
I bet you can remember that without looking at the documentation.
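For context, passing a `(username, password)` tuple as `auth` selects HTTP Basic authentication, which comes down to a single request header. The sketch below builds that header by hand with the standard library, assuming Python 3 and UTF-8 encoded credentials. The helpers `basic_auth_header` and `build_authenticated_request` are illustrative only and are not part of either library.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build the value of the Authorization header for HTTP Basic auth."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_authenticated_request(url, username, password):
    """Create (but do not send) a request that carries Basic credentials."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

The resulting object can be passed to `urllib.request.urlopen`. Note that base64 is an encoding rather than encryption, so Basic authentication is best used over HTTPS.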

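Error handling:
Requests also makes error handling very convenient. If you supply a wrong username or password, urllib2 raises a urllib2.URLError (specifically its subclass urllib2.HTTPError), whereas Requests returns a normal response object, just as you would expect. Checking the boolean response.ok tells you whether the request, and therefore the login, succeeded.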

>>> response = requests.get(url, auth=('dan', 'wrongPass'))
>>> response.ok
False
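Both styles of error handling are available in Requests. The sketch below, assuming a reasonably recent version of the requests package, contrasts checking `response.ok` with calling `raise_for_status()`, which turns an error status into an exception much as urllib2 does. The function names `summarize` and `require_success` are made up for this example.

```python
import requests


def summarize(response):
    """Describe a requests.Response without raising on HTTP errors."""
    if response.ok:
        return "ok ({})".format(response.status_code)
    return "failed ({})".format(response.status_code)


def require_success(response):
    """Turn an HTTP error status into an exception, urllib2-style."""
    # raise_for_status() raises requests.exceptions.HTTPError for 4xx and
    # 5xx responses and does nothing otherwise.
    response.raise_for_status()
    return response
```

For instance, `summarize(requests.get(url, auth=('dan', 'wrongPass')))` would report the failure as a string, while wrapping the same call in `require_success` would raise instead.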

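Some other features:
* The Requests API is just as simple for the HEAD, POST, PUT, PATCH, and DELETE methods
* It handles multipart uploads and also decodes response content automatically
* The documentation is better
* And there is more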
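To illustrate the first point in the list above, the sketch below prepares requests for different HTTP methods without sending anything, so it can be run offline. It assumes the requests package is installed; `prepare` and `preview` are hypothetical helper names, and the URLs are placeholders.

```python
import requests


def prepare(method, url, **kwargs):
    """Build a request for any HTTP method without sending it."""
    return requests.Request(method, url, **kwargs).prepare()


def preview(prepared):
    """Return the parts of a prepared request that would go on the wire."""
    return prepared.method, prepared.url, prepared.body
```

For example, `preview(prepare("POST", "http://example.test/items", data={"name": "demo"}))` shows the method, URL, and form-encoded body that would be sent. To actually send requests, the shortcuts `requests.head`, `requests.post`, `requests.put`, `requests.patch`, and `requests.delete` follow the same calling pattern as `requests.get`.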

Requests is excellent; give it a try the next time you need to work with HTTP.

Time: 2024-10-01 23:00:37
