Python 3.x HTTP Error 403: Forbidden

The Forbidden error is often raised when using request.urlopen to open certain URLs.

For example:

from urllib import request

url_1 = 'https://movie.douban.com/subject/26363254/comments?status=P'
url_2 = 'https://www.glassdoor.com/Interview/Texas-Instruments-Interview-Questions-E651_P4.htm'

request.urlopen(url_1)  # no error raised

request.urlopen(url_2)  # HTTP Error 403: Forbidden raised

Here is the reason:

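When you use urllib.request.urlopen to visit a URL, the server receives only a bare request for the page, without the information a browser normally sends about itself, the operating system, and the platform (by default urllib identifies itself with a User-Agent of the form Python-urllib/3.x). Requests that lack this information often come from abnormal visitors such as crawlers.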

Some websites verify the User-Agent header to block these abnormal visits.
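To see what such a site receives by default, you can inspect the headers that urllib attaches when you do not set any. This is a minimal sketch; the function name default_user_agent is made up for illustration, and the exact version suffix depends on your Python version.

```python
from urllib import request


def default_user_agent():
    """Return the User-Agent value urllib sends when none is set."""
    opener = request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request
    # made through this opener.
    headers = dict(opener.addheaders)
    return headers.get("User-agent")


print(default_user_agent())  # something of the form Python-urllib/3.x
```

A server that filters on the User-Agent can easily recognise this value as a script rather than a browser.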

So the solution is to set a browser-like User-Agent header, so that the request looks like it comes from a browser.

req = request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})  # this usually fixes it; you can also put more details in the User-Agent
request.urlopen(req)  # pass the Request object instead of the bare URL
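Putting the pieces together, the fix might look like the sketch below. The helper names build_request and fetch, the BROWSER_HEADERS constant, and the 10-second timeout are assumptions for illustration, not part of urllib. The header value is the one used above. Some sites check more than the User-Agent, so a 403 can still come back, and the sketch reports it instead of crashing.

```python
from urllib import request
from urllib.error import HTTPError

# Hypothetical default; the value is the one suggested in the post.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0"}


def build_request(url, headers=None):
    """Return a Request carrying browser-like headers plus any extras."""
    merged = dict(BROWSER_HEADERS)
    if headers:
        merged.update(headers)
    return request.Request(url, headers=merged)


def fetch(url, timeout=10):
    """Return (status, body bytes); on an HTTP error return (code, None)."""
    req = build_request(url)
    try:
        with request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read()
    except HTTPError as err:
        # A 403 here means the site rejected the request despite the header.
        return err.code, None
```

Calling fetch(url_2) with the Glassdoor URL from the example would then return a status code and the page body, or the error code and None if the site still refuses the request.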

Time: 2024-08-11 15:21:09
