Python 2 urllib notes

import urllib

base='http://httpbin.org/'
ip=base+'ip'
r=urllib.urlopen(ip)
print r.geturl()
print r.read()

# GET: append the URL-encoded parameters as the query string
get=base+"get"
parms=urllib.urlencode({"name":"tom","age":18})
r=urllib.urlopen("%s?%s"%(get,parms))
print r.geturl()
print r.read()

# POST: pass the URL-encoded parameters as the second argument (the request body)
post=base+"post"
parms=urllib.urlencode({"name":"tom","age":18})
r=urllib.urlopen(post,parms)
print r.geturl()
print r.read()

# Request through a proxy
proxies = {'http': 'http://proxy.example.com:8080/'}
opener = urllib.FancyURLopener(proxies)
f = opener.open("http://www.python.org")
print f.read()

# Download page data to a local file
# urllib.urlretrieve()
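
The GET and POST examples above differ only in where the encoded parameters go: in the query string for GET, in the request body for POST. The following is a minimal sketch of that encoding step alone, with no network access. The helper name `build_get_url` and the Python 3 import fallback are assumptions added for illustration; they are not part of the original notes.

```python
# Sketch: build the GET URL and the POST body with urlencode, without sending anything.
try:
    from urllib import urlencode          # Python 2, as used in these notes
except ImportError:
    from urllib.parse import urlencode    # Python 3 fallback (assumption, so the sketch runs on either)

BASE = 'http://httpbin.org/'


def build_get_url(endpoint, params):
    """Return the endpoint URL with params appended as a query string (hypothetical helper)."""
    return '%s?%s' % (endpoint, urlencode(params))


# A list of pairs keeps the parameter order stable; a dict would also work.
params = [('name', 'tom'), ('age', 18)]

get_url = build_get_url(BASE + 'get', params)   # parameters travel in the URL
post_body = urlencode(params)                   # parameters travel in the request body

print(get_url)
print(post_body)
```

In Python 2, `urllib.urlopen(url)` with `get_url` issues a GET, while `urllib.urlopen(BASE + 'post', post_body)` issues a POST, matching the two calls in the notes.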

Downloading files and web pages

'''
Created on 2014-09-18

@author: cocoajin

File download program

'''

import urllib
import urlparse

qihu360='http://dl.360safe.com/mac/safe/360InternetSecurity_1.0.75.dmg'
gitRF='http://gitref.org/zh/index.html'

url=qihu360

# Extract the file name from the URL and save the file to the desktop
desk='/Users/teso/Desktop/'
up=urlparse.urlsplit(url)
fname=up.path.split('/')[-1]
path=desk+fname

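# Download progress callback
def showDN(dataNums,oneData,totalData):
    '''
    Callback invoked during the download to report progress.
    dataNums: number of data blocks downloaded so far
    oneData: size of one data block
    totalData: total size of the data
    '''
    if totalData <= 0:
        # Size unknown (urlretrieve may pass -1), so no percentage can be computed
        print 'downloading...'
        return
    download=min(100.0, 100.0*dataNums*oneData/totalData)
    print 'downloading %.2f%% ' % (download)
    if download >= 100:
        print 'download finished'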

# urlretrieve returns a (local filename, headers) tuple
result=urllib.urlretrieve(url, path,showDN)
print result
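
The two calculations in the download script, taking the file name from the URL path and turning the callback arguments into a percentage, can be checked without downloading anything. The sketch below isolates them as pure functions. The names `filename_from_url` and `percent_done`, the `default` fallback name, and the Python 3 import fallback are all hypothetical additions for illustration.

```python
# Sketch: the file-name and progress calculations as pure functions (no network).
try:
    from urlparse import urlsplit         # Python 2, as used in these notes
except ImportError:
    from urllib.parse import urlsplit     # Python 3 fallback (assumption)


def filename_from_url(url, default='download.bin'):
    """Return the last path segment of url, or default when the path ends in '/'."""
    name = urlsplit(url).path.split('/')[-1]
    return name or default


def percent_done(block_count, block_size, total_size):
    """Return progress in percent, capped at 100.0, or None when the size is unknown."""
    if total_size <= 0:
        # The reporthook may receive -1 when the server reports no size.
        return None
    return min(100.0, 100.0 * block_count * block_size / total_size)


print(filename_from_url('http://gitref.org/zh/index.html'))
print(percent_done(5, 8192, 81920))
```

The cap at 100.0 matters because the last block is usually only partly filled, so `block_count * block_size` can exceed the real total.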

Time: 2024-10-08 01:56:50
