Scraping Weather Data from China Weather Network (weather.com.cn) with Python

A small crawler built on the requests library.

Usage: open a terminal and run "python3 weather.py 北京" (or the name of your own city).

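The program needs a "data.csv" file in the same folder as the script. The script reads the first column of each row as the city name and the second column as the city code. For the file contents, see: https://www.cnblogs.com/Rhythm-/p/9255190.html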
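
To make the role of "data.csv" concrete, here is a minimal sketch of how a city name could be turned into the forecast URL. The two sample rows and the helper names load_city_codes and build_url are illustrative assumptions, not part of the original script; the real table comes from the linked post.

```python
import csv
import io

# Hypothetical sample rows standing in for data.csv (city name, city code).
SAMPLE_CSV = "北京,101010100\n上海,101020100\n"


def load_city_codes(fileobj):
    """Return a {city name: city code} dict from a two-column CSV."""
    codes = {}
    for row in csv.reader(fileobj):
        # Skip blank or malformed rows instead of raising IndexError.
        if len(row) >= 2:
            codes[row[0].strip()] = row[1].strip()
    return codes


def build_url(city, codes):
    """Build the one-day forecast URL in the form the script requests."""
    return "http://www.weather.com.cn/weather1d/" + codes[city] + ".shtml"


codes = load_city_codes(io.StringIO(SAMPLE_CSV))
print(build_url("北京", codes))
```

With a real file, load_city_codes can be given the object returned by open("data.csv", encoding="utf-8", newline="") in place of the in-memory sample.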

Source code:

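import csv
import re
import sys
import webbrowser

import requests
from requests.exceptions import RequestException

# Load the city name -> city code table (assumes data.csv is saved as UTF-8)
data = {}
with open("data.csv", "r", encoding="utf-8", newline="") as f:
    for row in csv.reader(f):
        if len(row) >= 2:  # skip blank or malformed rows
            data[row[0]] = row[1]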
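def get_one_page(url, headers):
    """Return the page text, or None if the request fails."""
    try:
        # The timeout keeps the script from hanging on a dead connection
        response = requests.get(url, headers=headers, timeout=10)
        if response.status_code == 200:
            response.encoding = "utf-8"
            return response.text
        return None
    except RequestException:
        return None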
headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/604.4.7 (KHTML, like Gecko) Version/11.0.2 Safari/604.4.7"}
try:
    # sys.argv[1] is the city name given on the command line
    address = data[sys.argv[1]]
except (IndexError, KeyError):
    sys.exit("\033[31mCity not found!\033[0m")
html = get_one_page("http://www.weather.com.cn/weather1d/" + address + ".shtml", headers)
if not html:
    print("Failed to fetch the page; the city code may be wrong!")
    sys.exit(1)
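# The Chinese characters in the patterns match text on the page and must stay as-is
ADDRESS = re.findall("<title>(.*?)</title>", html)
aim = re.findall('<input type="hidden" id="hidden_title" value="(.*?)月(.*?)日(.*?)时(.*?) (.*?)  (.*?)  (.*?)"', html, re.S)
airdata = re.findall('<li class="li6 hot">\n<i></i>\n<span>(.*?)</span>\n<em>(.*?)</em>\n<p>(.*?)</p>\n</li>', html, re.S)
if not (ADDRESS and aim and airdata):
    sys.exit("Could not parse the page; its layout may have changed.")
print(ADDRESS[0][1:5])
print("Current date: %s/%s, %s" % (aim[0][0], aim[0][1], aim[0][4]))
print("Updated at: %s:00" % aim[0][2])
print("Current weather: %s" % aim[0][5])
print("Today's temperature: %s" % aim[0][6])
print("Air quality: " + airdata[0][0] + ", " + airdata[0][2])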
ask_ok = input("Show more details? (Y/N): ")
if ask_ok in ("Y", "y"):
    # Living-index boxes: each match is (level, index name, advice)
    lightdata = re.findall('<li class="li1 hot">\n<i></i>\n<span>(.*?)</span>\n<em>(.*?)</em>\n<p>(.*?)</p>\n</li>', html, re.S)
    colddata = re.findall('<li class="li2 hot">\n(.*?)</span>\n<em>(.*?)</em>\n<p>(.*?)</p>', html, re.S)
    weardata = re.findall('<li class="li3 hot" id="chuanyi">\n(.*?)<span>(.*?)</span>\n<em>(.*?)</em>\n<p>(.*?)</p>', html, re.S)
    washdata = re.findall('<li class="li4 hot">\n<i></i>\n<span>(.*?)</span>\n<em>(.*?)</em>\n<p>(.*?)</p>\n</li>', html, re.S)
    bloodata = re.findall('<li class="li5 hot">\n<i></i>\n<span>(.*?)</span>\n<em>(.*?)</em>\n<p>(.*?)</p>\n</li>', html, re.S)
    # Hourly forecast: a list of comma-separated strings embedded in the page script
    detail = re.findall(r'hour3data=\{"1d":(.*?),"23d"', html, re.S)
    detail = re.findall('"(.*?)"', detail[0], re.S) if detail else []
    print("--" * 40)
    print("Detailed data:")
    print("%-10s\t%-10s\t%-10s\t%-10s\t%-10s" % ("Time", "Condition", "Temp", "Wind dir", "Wind force"))
    for each in detail:
        each = each.split(",")
        if len(each) < 6:  # skip entries that lack the expected fields
            continue
        print("%-10s\t%-10s\t%-10s\t%-10s\t%-10s" % (each[0], each[2], each[3], each[4], each[5]))
    print("--" * 40)
    print("%s:\t%s\t%s" % (lightdata[0][1], lightdata[0][0], lightdata[0][2]))
    print("%s:\t%s" % (colddata[0][1], colddata[0][2]))
    print("%s:\t%s\t%s" % (washdata[0][1], washdata[0][0], washdata[0][2]))
    print("Blood sugar index:\t%s,%s" % (bloodata[0][0], bloodata[0][2]))
    print("%s:\t%s\t%s" % (weardata[0][2], weardata[0][1], weardata[0][3]))
    print("--" * 40)
    flag = input("Open detailed clothing advice? (Y/N): ")
    if flag in ("Y", "y"):
        webbrowser.open("http://www.weather.com.cn/forecast/ct.shtml?areaid=" + address)
print("Data source: Central Meteorological Observatory")
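
The two regular expressions that carry most of the script's logic (the hidden_title field and the hour3data array) can be tried offline. The sketch below runs the same patterns against a made-up fragment. SAMPLE_HTML only mimics the shape those patterns expect and is not real output from weather.com.cn, and the helper names parse_summary and parse_hourly are hypothetical.

```python
import re

# Made-up fragment that only imitates the structure the patterns expect.
SAMPLE_HTML = (
    '<input type="hidden" id="hidden_title" '
    'value="07月03日08时 周二  多云  33/23°C" />\n'
    'var hour3data={"1d":["03日08时,d01,多云,25℃,南风,<3级,0",'
    '"03日11时,d00,晴,30℃,南风,<3级,0"],"23d":[]}'
)

SUMMARY_RE = re.compile(
    '<input type="hidden" id="hidden_title" '
    'value="(.*?)月(.*?)日(.*?)时(.*?) (.*?)  (.*?)  (.*?)"',
    re.S,
)


def parse_summary(html):
    """Return today's summary as a dict, or None if the field is missing."""
    match = SUMMARY_RE.search(html)
    if match is None:
        return None
    month, day, hour, _, weekday, weather, temperature = match.groups()
    return {
        "month": month,
        "day": day,
        "hour": hour,
        "weekday": weekday,
        "weather": weather,
        "temperature": temperature,
    }


def parse_hourly(html):
    """Return the hourly entries, each split into its comma-separated fields."""
    block = re.findall(r'hour3data=\{"1d":(.*?),"23d"', html, re.S)
    if not block:
        return []
    return [item.split(",") for item in re.findall('"(.*?)"', block[0], re.S)]


print(parse_summary(SAMPLE_HTML))
for fields in parse_hourly(SAMPLE_HTML):
    # Same columns the script prints: time, condition, temperature, wind
    print(fields[0], fields[2], fields[3], fields[4], fields[5])
```

Returning None or an empty list when a pattern finds nothing lets the caller report a layout change instead of failing with an IndexError.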

Original post: https://www.cnblogs.com/Rhythm-/p/9255255.html

Date: 2024-10-01 19:27:51
