Fetching the latest group of messages from a WeChat message list with PHP

<?php 

// Login credentials for the WeChat public platform (placeholders).
$_G['wx_g'] = array('init' => array(
				"wx_content" => array("weixin_user" => "WeChat account", "weixin_pass" => "WeChat password")
			)
);

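// Log in first so that $_G['wx_g'] holds the session cookie and token.
wx_login();
$message_list = get_message_list();
// ID of the first article in the newest message group; cast to int because it is used in SQL.
$file_id = (int) $message_list['item'][0]['multi_item'][0]['file_id'];
//print_r($message_list);exit;
// Refresh the table only when the newest group is not stored yet.
if($file_id && !DB::result_first("select count(weiyi_id) from test.yangang_jiaojing where weiyi_id={$file_id} ")){
	DB::query("delete from  test.yangang_jiaojing");
	foreach ($message_list['item'][0]['multi_item'] as $key => $val){
			// The table uses the gbk charset, so the UTF-8 text fields are converted.
			// Escape before converting: a GBK trail byte can be 0x5C (backslash), which addslashes() would mangle.
			$val['title']=mb_convert_encoding(addslashes($val['title']), 'GBK','UTF-8');
			$val['weiyi_id']=(int) $val['file_id'];
			$val['des']=mb_convert_encoding(addslashes($val['digest']), 'GBK','UTF-8');
			$val['picurl']=addslashes($val['cover']);
			$val['detail']=addslashes($val['content_url']);
			$query_cheng = "INSERT INTO test.yangang_jiaojing(weiyi_id,title,pic_url,detail_url,des)VALUES ({$val['weiyi_id']},'{$val['title']}','{$val['picurl']}','{$val['detail']}','{$val['des']}')";
			$count1=DB::query($query_cheng);
	}
}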

// Fetch the message list page and return the embedded JSON as an array (null on failure).
function get_message_list(){

	global $_G;

	$cookie=$_G['wx_g']['cookie'];

	$url = "https://mp.weixin.qq.com/cgi-bin/appmsg?begin=0&count=2&t=media/appmsg_list&type=10&action=list&token=".$_G['wx_g']['token']."&lang=zh_CN";

	$ch = curl_init();

	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_COOKIE, $cookie);
	// The referer is the list page itself.
	curl_setopt($ch, CURLOPT_REFERER, $url);
	curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0");
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION,true);
	// Verify the server certificate. The forced SSLv3 setting (CURLOPT_SSLVERSION 3) is dropped
	// because SSLv3 is obsolete and insecure; curl negotiates the protocol version itself.
	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
	curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
	$output2 = curl_exec($ch);
	curl_close($ch);
	//echo $output2;exit;
	if($output2 === false) return null;

	// The page embeds the list as "wx.cgiData = {...,"file_cnt":...}".
	// Cut out everything up to ',"file_cnt":' and close the object again.
	$output1=explode('wx.cgiData = ',$output2,2);
	if(count($output1) < 2) return null;
	$output1=explode(',"file_cnt":',$output1[1],2);
	$output1=$output1[0].'}';

	$message_list=json_decode($output1,true);
	//print_r($message_list);exit;

	return $message_list;

}

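// Log in to the WeChat public platform and store the session cookie and token in $_G['wx_g'].
function wx_login(){

	global $_G;
	//echo $_G['wx_g']['init']['wx_content']['weixin_user'];exit;
	$username = $_G['wx_g']['init']['wx_content']['weixin_user'];
	// The login form submits the MD5 hash of the password.
	$pwd = md5($_G['wx_g']['init']['wx_content']['weixin_pass']);

	$url = "https://mp.weixin.qq.com/cgi-bin/login?lang=zh_CN";
	$post_data = "username=".urlencode($username)."&pwd=".$pwd."&imgcode=&f=json";
	$cookie = "pgv_pvid=2067516646";
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	// Include the response headers in the output so the Set-Cookie lines can be read.
	curl_setopt($ch, CURLOPT_HEADER, 1);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_POST, 1);
	curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);
	curl_setopt($ch, CURLOPT_COOKIE, $cookie);
	curl_setopt($ch, CURLOPT_REFERER, "https://mp.weixin.qq.com/cgi-bin/loginpage?t=wxm2-login&lang=zh_CN");
	curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0");
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION,true);
	// Verify the server certificate. The forced SSLv3 setting (CURLOPT_SSLVERSION 3) is dropped
	// because SSLv3 is obsolete and insecure; curl negotiates the protocol version itself.
	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
	curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
	$output = curl_exec($ch);
	curl_close($ch);

	//echo $output;exit;

	// Split headers from body; the limit keeps a body that contains blank lines intact.
	$parts = explode("\r\n\r\n", (string) $output, 2);
	$header = $parts[0];
	$body = isset($parts[1]) ? $parts[1] : '';

	preg_match_all("/set\-cookie:([^\r\n]*)/i", $header, $matches);

	// Join every cookie the server set (the original joined the first two or four by index).
	$cookie = implode('', $matches[1]);

	// Strip cookie attributes so that only name=value pairs remain.
	$cookie = str_replace(array('Path=/',' ; Secure; HttpOnly','=;'),array('','','='), $cookie);
	$cookie = 'pgv_pvid=6648492946;'.$cookie;

	// The JSON body carries a redirect URL whose query string contains the session token.
	$data = json_decode($body,true);
	$redirect_url = isset($data['redirect_url']) ? $data['redirect_url'] : '';
	$result = explode('token=',$redirect_url);
	$token = isset($result[1]) ? $result[1] : '';
	// NOTE: cpmsg(), $installlang and $request_url are not defined in this snippet;
	// they have to be provided by the host application.
	if(!$token) cpmsg($installlang['import_error_password'], "{$request_url}&step=import&pswerror=1", 'error');

	// Store in the global variable
	$_G['wx_g']['cookie'] = $cookie;
	$_G['wx_g']['token'] = $token;

}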

?>
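
The string handling in get_message_list() can be tried offline against a canned page. The following is a minimal sketch, assuming the page embeds the list as `wx.cgiData = {...,"file_cnt":...}`, which is what the explode() calls in the script imply. The helper name `extract_cgi_data` and the sample HTML are hypothetical and only serve as an illustration.

```php
<?php
// Hypothetical helper: pull the JSON object that follows "wx.cgiData = "
// out of a page and decode it. Returns null when the marker is missing.
function extract_cgi_data($html)
{
    $parts = explode('wx.cgiData = ', $html, 2);
    if (count($parts) < 2) {
        return null; // marker not found, e.g. the session has expired
    }
    // Keep everything before ',"file_cnt":' and close the object again.
    $parts = explode(',"file_cnt":', $parts[1], 2);
    if (count($parts) < 2) {
        return null;
    }
    $data = json_decode($parts[0] . '}', true);
    return is_array($data) ? $data : null;
}

// Made-up sample page, shaped like the structure the script expects.
$sample = '<script>wx.cgiData = {"item":[{"multi_item":'
        . '[{"file_id":10001,"title":"Hello"}]}],"file_cnt":{"total":1}};</script>';

$list = extract_cgi_data($sample);
echo $list['item'][0]['multi_item'][0]['title'], "\n";
```

Returning null for a page without the marker lets the caller skip the database update instead of running a query with an empty ID.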

CREATE TABLE IF NOT EXISTS `yangang_jiaojing` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(100) NOT NULL,
  `des` varchar(300) NOT NULL,
  `detail_url` varchar(300) NOT NULL,
  `pic_url` varchar(300) NOT NULL,
  `note` varchar(50) NOT NULL DEFAULT '',
  `weiyi_id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=gbk AUTO_INCREMENT=1 ;
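
How one entry of `multi_item` maps onto the columns of this table can be sketched as below. This is only an illustration under assumptions: the helper name `to_table_row` and the sample values are hypothetical, the field names (`file_id`, `title`, `digest`, `cover`, `content_url`) are the ones the script reads, and the mbstring extension is assumed to be available, as it is for the script itself.

```php
<?php
// Hypothetical helper: map one multi_item entry to the columns of
// yangang_jiaojing. Text fields are converted from UTF-8 to GBK
// because the table is declared with CHARSET=gbk.
function to_table_row(array $item)
{
    $row = array(
        'weiyi_id'   => (int) $item['file_id'],
        'title'      => $item['title'],
        'des'        => $item['digest'],
        'pic_url'    => $item['cover'],
        'detail_url' => $item['content_url'],
        'note'       => '', // the script never fills this column
    );
    foreach (array('title', 'des') as $col) {
        $row[$col] = mb_convert_encoding($row[$col], 'GBK', 'UTF-8');
    }
    return $row;
}

// Made-up sample entry.
$item = array(
    'file_id'     => '10001',
    'title'       => '标题',
    'digest'      => '摘要',
    'cover'       => 'https://example.com/cover.jpg',
    'content_url' => 'https://example.com/article',
);

$row = to_table_row($item);
// GBK stores each of these Chinese characters in two bytes.
echo strlen($row['title']), " bytes\n";
```

The values still have to be escaped or bound as parameters by the database layer before they go into an INSERT statement.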

Time: 2024-10-11 03:07:22
