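This walkthrough uses a friend's web-course homework site as the example:
http://simpledating.sinaapp.com/
The standard-library module urllib2 (Python 2) is enough to request its pages.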
GET:
First, look at a normal GET request.
The code is as follows:
# coding: utf-8
import urllib2

# Turn on the debug log to make debugging easier
httpHandler = urllib2.HTTPHandler(debuglevel=1)
httpsHandler = urllib2.HTTPSHandler(debuglevel=1)
opener = urllib2.build_opener(httpHandler, httpsHandler)
urllib2.install_opener(opener)

try:
    header = {
        'Accept': '*/*',
        'Accept-Language': 'zh-CN,zh;q=0.8',
        # urllib2 overrides this and sends "Connection: close" (see the log)
        'Connection': 'keep-alive',
        'Host': 'simpledating.sinaapp.com',
        'Origin': 'http://simpledating.sinaapp.com',
        'Referer': 'http://simpledating.sinaapp.com/',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36 SE 2.X MetaSr 1.0',
    }
    # Attach the headers to the request
    req = urllib2.Request(
        url='http://simpledating.sinaapp.com/',
        headers=header
    )
    response = urllib2.urlopen(req)
    print response.read()
except urllib2.HTTPError as e:
    # The server answered with an HTTP error status
    print 'error', e.code
Output:
D:\Python\python.exe D:/Users/Administrator/PycharmProjects/web/get.py
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nOrigin: http://simpledating.sinaapp.com\r\nAccept-Language: zh-CN,zh;q=0.8\r\nConnection: close\r\nAccept: */*\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36 SE 2.X MetaSr 1.0\r\nHost: simpledating.sinaapp.com\r\nReferer: http://simpledating.sinaapp.com/\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: nginx/1.4.4
header: Date: Mon, 12 Jan 2015 06:22:26 GMT
header: Content-Type: text/html; charset=UTF-8
header: Content-Length: 12559
header: Connection: close
header: Etag: "1688e856917006e409c1932e75e871bf53b2315c"
header: via: yq34.pyruntime
header: Set-Cookie: saeut=CkMPIlSzaCIY9kkJCYxEAg==; expires=Thu, 31-Dec-37 23:55:55 GMT; path=/
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
...
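For readers on Python 3, where urllib2 was split into urllib.request and urllib.error, the same GET can be sketched roughly as below. This is only an illustration under that assumption: the helper names build_get_request and fetch are hypothetical, and the header set is trimmed to a few of the fields used above.

```python
import urllib.error
import urllib.request

URL = 'http://simpledating.sinaapp.com/'


def build_get_request(url=URL):
    """Build (but do not send) a GET request with browser-like headers."""
    headers = {
        'Accept': '*/*',
        'Accept-Language': 'zh-CN,zh;q=0.8',
        'Referer': url,
        # Shortened User-Agent, for illustration only
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64)',
    }
    # No data argument, so the request method is GET
    return urllib.request.Request(url, headers=headers)


def fetch(req):
    """Send req with the debug log enabled; return the body, or None on an HTTP error."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPHandler(debuglevel=1),
        urllib.request.HTTPSHandler(debuglevel=1),
    )
    try:
        with opener.open(req) as response:
            return response.read()
    except urllib.error.HTTPError as e:
        print('error', e.code)
        return None
```

Calling fetch(build_get_request()) would send the request and print a debug log similar to the one above; building the request is kept separate so it can be inspected without touching the network.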
POST:
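First, look at a normal POST request.
The form carries three fields (place, content and time). The site also adds an _xsrf token as protection against cross-site request forgery (XSRF), so that forged requests are rejected.
So the request data and the cookies have to be forged as well.
The code is as follows: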
# coding: utf-8
import urllib
import urllib2

# Turn on the debug log to make debugging easier
httpHandler = urllib2.HTTPHandler(debuglevel=1)
httpsHandler = urllib2.HTTPSHandler(debuglevel=1)
opener = urllib2.build_opener(httpHandler, httpsHandler)
urllib2.install_opener(opener)

try:
    # data is the form to submit, filled in using the format the site expects
    data = {
        u'_xsrf': u'f65fb8fdd4134e1f815c7a10f37561f6',
        u'place': u'~',
        u'content': u'asdwqwt',
        u'time': u'9999/99/99 ab:cd',
    }
    data = urllib.urlencode(data)
    header = {
        'Accept': '*/*',
        # 'Accept-Encoding': 'gzip,deflate,sdch',
        'Accept-Language': 'zh-CN,zh;q=0.8',
        'Connection': 'keep-alive',
        # The _xsrf value in the cookie is the same as the one in the form data
        'Cookie': 'saeut=CkMPGlSn3wdyh1mEHBWpAg==; user=eGlvbmdiaWFv|1421041643|f368e242dd53c65621ee754688042ae8898c572b;_xsrf=f65fb8fdd4134e1f815c7a10f37561f6',
        'Host': 'simpledating.sinaapp.com',
        'Origin': 'http://simpledating.sinaapp.com',
        'Referer': 'http://simpledating.sinaapp.com/createBroadDating',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36 SE 2.X MetaSr 1.0',
        # Headers added for the POST:
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        # Length of the encoded body (92 for the data above)
        'Content-Length': str(len(data)),
        'X-Requested-With': 'XMLHttpRequest',
    }
    # Passing data makes urllib2 send a POST instead of a GET
    req = urllib2.Request(
        url='http://simpledating.sinaapp.com/createBroadDating',
        data=data,
        headers=header
    )
    response = urllib2.urlopen(req)
    print response.read()
except urllib2.HTTPError as e:
    # The server answered with an HTTP error status
    print 'error', e.code
The output is as follows:
D:\Python\python.exe D:/Users/Administrator/PycharmProjects/web/web.py
send: 'POST /createBroadDating HTTP/1.1\r\nAccept-Encoding: identity\r\nOrigin: http://simpledating.sinaapp.com\r\nContent-Length: 92\r\nAccept-Language: zh-CN,zh;q=0.8\r\nConnection: close\r\nAccept: */*\r\nUser-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36 SE 2.X MetaSr 1.0\r\nHost: simpledating.sinaapp.com\r\nX-Requested-With: XMLHttpRequest\r\nCookie: saeut=CkMPGlSn3wdyh1mEHBWpAg==; user=eGlvbmdiaWFv|1421041643|f368e242dd53c65621ee754688042ae8898c572b;_xsrf=f65fb8fdd4134e1f815c7a10f37561f6\r\nReferer: http://simpledating.sinaapp.com/createBroadDating\r\nContent-Type: application/x-www-form-urlencoded; charset=UTF-8\r\n\r\ncontent=asdwqwt&_xsrf=f65fb8fdd4134e1f815c7a10f37561f6&place=%7E&time=9999%2F99%2F99+ab%3Acd'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: nginx/1.4.4
header: Date: Mon, 12 Jan 2015 06:49:43 GMT
header: Content-Type: text/html; charset=UTF-8
header: Content-Length: 7
header: Connection: close
header: via: yq34.pyruntime
success
The body of the response is "success", so the POST went through.
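In the POST above, the same _xsrf value was copied by hand into both the form data and the Cookie header, which is presumably what the server compares. A minimal sketch of doing that copy in code is shown below; parse_cookie_header and build_post are hypothetical helper names, not part of urllib2, and the cookie string is the one from the example.

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3


def parse_cookie_header(cookie):
    """Split a 'k1=v1; k2=v2' Cookie header value into a dict."""
    cookies = {}
    for part in cookie.split(';'):
        # Split on the first '=' only, since values may contain '=' themselves
        name, sep, value = part.strip().partition('=')
        if sep:
            cookies[name] = value
    return cookies


def build_post(cookie, fields):
    """Return (body, headers), with _xsrf copied from the cookie into the form."""
    token = parse_cookie_header(cookie)['_xsrf']
    form = dict(fields, _xsrf=token)
    body = urlencode(form)
    headers = {
        'Cookie': cookie,
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        # Computed from the encoded body instead of being hard-coded
        'Content-Length': str(len(body)),
        'X-Requested-With': 'XMLHttpRequest',
    }
    return body, headers


cookie = ('saeut=CkMPGlSn3wdyh1mEHBWpAg==; '
          'user=eGlvbmdiaWFv|1421041643|f368e242dd53c65621ee754688042ae8898c572b;'
          '_xsrf=f65fb8fdd4134e1f815c7a10f37561f6')
body, headers = build_post(cookie, {'place': '~',
                                    'content': 'asdwqwt',
                                    'time': '9999/99/99 ab:cd'})
```

The resulting body and headers can be passed as data and headers to urllib2.Request as in the POST code above. Under Python 3 the body would first need to be encoded to bytes, for example with body.encode('ascii').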