For a simple crawler, just write it yourself; for a complex one, use Scrapy.
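import re
import urllib.request  # Python 3; the original used Python 2's urllib2

from bs4 import BeautifulSoup

# url must be set to the page to fetch.
# Send a browser-like User-Agent, since some sites reject the default one.
req = urllib.request.Request(url, headers={"User-Agent": "Magic Browser"})
webpage = urllib.request.urlopen(req)
soup = BeautifulSoup(webpage.read(), "html.parser")
...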
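As a rough illustration of the "write it yourself" route, here is a minimal sketch that uses only the Python 3 standard library instead of BeautifulSoup. The names LinkCollector, extract_links, and fetch are hypothetical helpers made up for this example, and the "Magic Browser" User-Agent is simply carried over from the snippet above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in an HTML string, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch(url, user_agent="Magic Browser", timeout=10):
    """Download a page as text, sending a custom User-Agent header."""
    req = Request(url, headers={"User-Agent": user_agent})
    with urlopen(req, timeout=timeout) as resp:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

A crawl step would then look like extract_links(fetch(page_url), page_url). Once a project needs request scheduling, retries, or throttling, that is probably the point where Scrapy becomes the better fit.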
Time: 2024-10-11 16:43:16