Downloading Web Page Images with Python

# coding: utf-8
import requests
from bs4 import BeautifulSoup

# Directory where the downloaded pictures are saved (it must already exist)
PWD = "/jiaoben/python/meizitu/pic/"
head = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
TimeOut = 5
PhotoName = 0
c = '.jpeg'
for x in range(1, 4):
    site = "http://www.meizitu.com/a/qingchun_3_%d.html" % x
    Page = requests.get(site, headers=head, timeout=TimeOut)
    # Pass the raw bytes and let BeautifulSoup work out the encoding
    ContentSoup = BeautifulSoup(Page.content, 'html.parser')
    jpg = ContentSoup.find_all('img', {'class': 'scrollLoading'})
    for photo in jpg:
        # The real picture address is kept in the data-original attribute
        PhotoAdd = photo.get('data-original')
        if not PhotoAdd:
            continue
        PhotoName += 1
        Name = str(PhotoName) + c
        r = requests.get(PhotoAdd, headers=head, stream=True, timeout=TimeOut)
        with open(PWD + Name, 'wb') as fd:
            for chunk in r.iter_content(chunk_size=8192):
                fd.write(chunk)
print("You have downloaded %d photos" % PhotoName)

# -*- coding:utf-8 -*-
import urllib.request
path = "D:\\Download"
url = "http://pic2.sc.chinaz.com/files/pic/pic9/201309/apic520.jpg"
name = "D:\\download\\1.jpg"
# When saving, keep the file type consistent: if the picture is a jpg, the file being opened should also be named .jpg, otherwise the saved picture may not open as a valid image
conn = urllib.request.urlopen(url)
with open(name, 'wb') as f:
    f.write(conn.read())
print('Pic Saved!')

It is simple: open the URL, then save the response to a file in some folder.
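
The comment in the script above says the file extension should match the picture type. As a rough sketch of one way to check that, the hypothetical helper `guess_extension` below (not part of the original script) looks at the first bytes of the downloaded data and returns a matching extension. It only recognises the JPEG, PNG and GIF signatures. The `.bin` fallback for anything else is an arbitrary choice made for this example.

```python
def guess_extension(data):
    """Guess a file extension from the leading bytes of image data.

    Hypothetical helper for illustration; only three formats are recognised.
    """
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return ".gif"
    return ".bin"


# Offline check with hand-made byte strings instead of a real download
print(guess_extension(b"\xff\xd8\xff\xe0" + b"\x00" * 8))
print(guess_extension(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
```

In the script above, the helper could be called on the bytes returned by `conn.read()` before the file name is chosen.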

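Sometimes you may not want to type the full path. In that case, use the os module to change the current working directory:

import os
os.chdir("D:\\download")
os.getcwd()

After that, a file can be saved with just its name: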

f = open('1.jpg', 'wb')
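
The name `1.jpg` is fixed by hand here. As a rough sketch of an alternative, the file name can be taken from the last segment of the picture URL. The function names `filename_from_url` and `save_image` are made up for this example and are not from the original post, and the 10-second timeout is likewise an assumption.

```python
import posixpath
import urllib.parse
import urllib.request


def filename_from_url(url):
    """Return the last segment of the URL path, e.g. 'apic520.jpg'."""
    return posixpath.basename(urllib.parse.urlparse(url).path)


def save_image(url):
    """Download `url` into the current directory under its own file name."""
    name = filename_from_url(url)
    with urllib.request.urlopen(url, timeout=10) as conn, open(name, "wb") as f:
        f.write(conn.read())
    return name


# Only the name handling runs here; save_image(url) would need network access.
url = "http://pic2.sc.chinaz.com/files/pic/pic9/201309/apic520.jpg"
print(filename_from_url(url))  # prints apic520.jpg
```

If the URL path ends in a slash, the derived name is empty, so a real script would need a fallback name for that case.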

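The URL above is fixed, so only one picture is downloaded. To download in batches, you need a loop that works through different URLs.

Below is an example I found elsewhere. It changes the picture name inside the image URL so that the pictures can be saved in a loop, although it also starts from a known URL.

Source: http://www.oschina.net/code/snippet_1016509_21961 (OSChina, the Open Source China community). My own change was to pull the repeated code out into a function.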

import os
import urllib.request

def rename(name):
    # Pad the number with leading zeros to three digits and add the extension
    if len(name) == 2:
        name = '0' + name + '.jpg'
    elif len(name) == 1:
        name = '00' + name + '.jpg'
    else:
        name = name + '.jpg'
    return name

os.chdir("D:\\download")
count = 1
while count < 15:
    name = rename(str(count))
    print(name)
    url = 'http://bgimg1.meimei22.com/list/2012-5-24/2/sa' + name
    try:
        a = urllib.request.urlopen(url)
    except Exception as e:
        # Stop at the first picture that cannot be opened
        print(e)
        print(url + ' not found')
        break
    with open(name, "wb") as f:
        f.write(a.read())
    print(url + ' Saved!')
    count = count + 1
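
The three branches in `rename` pad the number with zeros to three digits. As a side note, Python's built-in `str.zfill` does the same padding. The sketch below checks that the two agree for the positive counts used in the loop. The name `rename_zfill` is made up for this example.

```python
def rename(name):
    # Same logic as the function in the script above
    if len(name) == 2:
        name = '0' + name + '.jpg'
    elif len(name) == 1:
        name = '00' + name + '.jpg'
    else:
        name = name + '.jpg'
    return name


def rename_zfill(name):
    # str.zfill pads on the left with zeros up to the given width
    return name.zfill(3) + '.jpg'


# Compare both versions for every count the download loop uses
for count in range(1, 15):
    assert rename(str(count)) == rename_zfill(str(count))
print(rename_zfill('7'), rename_zfill('12'))  # prints 007.jpg 012.jpg
```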

Of course, you can also open the HTTP connection yourself and then pick out the .jpg pictures dynamically.

import http.client

url = "desk.zol.com.cn"
conn = http.client.HTTPConnection(url)
conn.request("GET", "/dongman/")
r = conn.getresponse()
print(r.status, r.reason)
data1 = r.read()  # .decode('utf-8')  # decode with whatever encoding the page actually uses

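When I first wrote this, it kept failing with the message that the target machine actively refused the connection. I later found that I had used HTTPSConnection(), which connects to port 443 by default, so a server that only accepts plain HTTP refuses it. Take care to use HTTPConnection() here.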
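
The request above only reads the page into `data1`; picking out the .jpg addresses is left open. One possible sketch uses a regular expression over the decoded HTML. The function name `find_jpg_urls`, the pattern, and the sample HTML are assumptions for illustration, and a real page may need a proper HTML parser instead.

```python
import re

# Matches absolute http/https URLs ending in .jpg inside src="..." attributes
JPG_SRC = re.compile(r'src="(https?://[^"]+?\.jpg)"', re.IGNORECASE)


def find_jpg_urls(html):
    """Return the .jpg URLs found in src attributes, in page order, without duplicates."""
    seen = []
    for url in JPG_SRC.findall(html):
        if url not in seen:
            seen.append(url)
    return seen


# Hand-written sample standing in for the decoded contents of data1
sample = (
    '<img src="http://example.com/a/1.jpg">'
    '<img src="http://example.com/a/logo.png">'
    '<img src="http://example.com/a/2.jpg"><img src="http://example.com/a/1.jpg">'
)
print(find_jpg_urls(sample))
```

Each address in the returned list can then be opened and saved in the same way as the single-picture example earlier.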

Time: 2024-10-20 20:54:53
