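PhantomJS is a headless browser. Previously we drove PhantomJS through Selenium to load dynamically loaded data.
PhantomJS is no longer maintained, but Chrome's headless mode can take its place for the same job.
Example code using headless Chrome: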

    from time import sleep

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Run Chrome without a visible window
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')

    # `options=` replaces the deprecated `chrome_options=` keyword
    bro = webdriver.Chrome(options=chrome_options)
    bro.get('https://www.baidu.com')
    sleep(3)  # crude fixed wait for the page to finish loading
    print(bro.page_source)
    bro.save_screenshot('1.png')
    bro.quit()

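The fixed `sleep(3)` above either wastes time or is too short on a slow connection. A minimal sketch of a polling alternative follows; `wait_for` and its parameters are hypothetical names introduced here, not part of Selenium (which ships its own `WebDriverWait` in `selenium.webdriver.support.ui` for the same purpose).

```python
import time


def wait_for(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value
    or `timeout` seconds have passed. Returns the last result."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            # Timed out: hand back the last (falsy) result
            return result
        time.sleep(interval)


# Possible use with the `bro` driver from the example above:
# wait_for(lambda: bro.execute_script('return document.readyState') == 'complete')
```

The helper takes any zero-argument callable, so the same loop can wait for the document to finish loading, for an element count to change, or for a file to appear.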
Scrolling down the page

    import os
    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')

    bro = webdriver.Chrome(options=chrome_options)
    bro.get(url='https://movie.douban.com/typerank?type_name=%E7%88%B1%E6%83%85&type=13&interval_id=100:90&action=')
    time.sleep(3)

    # save_screenshot does not create missing directories
    os.makedirs('baidu', exist_ok=True)
    bro.save_screenshot('baidu/aiqing.png')

    # Have bro execute a simple piece of JS to simulate scrolling to the bottom
    js = 'window.scrollBy(500,100000)'
    bro.execute_script(js)
    time.sleep(3)
    bro.save_screenshot('baidu/aiqing2.png')

    # Get the page source and save it to a file
    html = bro.page_source
    with open('douban.html', 'w', encoding='utf8') as f:
        f.write(html)
    bro.quit()

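A single large `scrollBy` triggers only one round of loading. If the page keeps appending items as you scroll, one possible approach is to scroll repeatedly until the document height stops growing. The sketch below assumes a Selenium-style driver object with an `execute_script` method; `scroll_to_bottom`, `pause`, and `max_rounds` are hypothetical names chosen for illustration.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops
    growing or `max_rounds` scrolls were made. Returns the scroll count."""
    last_height = driver.execute_script('return document.body.scrollHeight')
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script('window.scrollTo(0, document.body.scrollHeight)')
        rounds += 1
        time.sleep(pause)  # give the page time to load more content
        new_height = driver.execute_script('return document.body.scrollHeight')
        if new_height == last_height:
            break  # nothing new was loaded, assume we reached the end
        last_height = new_height
    return rounds


# Possible use with the `bro` driver from the example above:
# scroll_to_bottom(bro, pause=2.0)
```

The `max_rounds` cap keeps the loop from running forever on pages with endless feeds.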
Scraping lazy-loaded images with Selenium and page scrolling

    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')

    bro = webdriver.Chrome(options=chrome_options)
    bro.get(url='http://sc.chinaz.com/tupian/ribenmeinv.html')
    time.sleep(2)

    # Page source before scrolling: lazy-loaded images are not loaded yet
    with open('lanjiazai.html', 'w', encoding='utf8') as f:
        f.write(bro.page_source)
    # bro.save_screenshot('lanjiazai.png')

    bro.execute_script('window.scrollBy(0,10000)')
    time.sleep(3)

    # Page source after scrolling, for comparison with the first file
    with open('lanjiazai2.html', 'w', encoding='utf8') as f:
        f.write(bro.page_source)
    # bro.save_screenshot('lanjiazai2.png')

    time.sleep(1)
    bro.quit()  # quit() also ends the driver process; close() only closes the window

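Once the page source is saved, the image URLs still have to be pulled out of it. Lazy-loading pages often keep the real URL in a secondary attribute and put a placeholder in `src` until the image scrolls into view. The sketch below uses only the standard library; the attribute names `src2`, `data-src`, and `data-original` are assumptions about common lazy-loading markup, not something confirmed for this site, and `extract_image_urls` is a hypothetical helper.

```python
from html.parser import HTMLParser


class ImageSrcParser(HTMLParser):
    """Collect one URL per <img> tag, preferring lazy-load attributes."""

    def __init__(self, lazy_attrs=('src2', 'data-src', 'data-original')):
        super().__init__()
        # Assumed attribute names; adjust to what the target page uses
        self.lazy_attrs = tuple(lazy_attrs)
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != 'img':
            return
        attr_map = dict(attrs)
        # Check lazy attributes first, then fall back to plain src
        for name in self.lazy_attrs + ('src',):
            if attr_map.get(name):
                self.urls.append(attr_map[name])
                break


def extract_image_urls(html, lazy_attrs=('src2', 'data-src', 'data-original')):
    parser = ImageSrcParser(lazy_attrs)
    parser.feed(html)
    return parser.urls


# Possible use with the `bro` driver from the example above:
# urls = extract_image_urls(bro.page_source)
```

Running it on both saved files and comparing the results is a quick way to see which images were filled in by scrolling.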
Original article: https://www.cnblogs.com/zhangjian0092/p/11407618.html
Time: 2024-10-11 20:14:36