ASP.NET MVC: Fetching WeChat Article Data (Body Content)

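1. Fetching the body of a WeChat article is done mainly by calling a third-party API (https://market.aliyun.com/products/56928004/cmapi012134.html). The article's URL is sent as the url query parameter, and the AppCode issued for the API goes in the Authorization header.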
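As a rough sketch of the request URL that the listing below builds, the query string can be assembled in a small helper. The class name WeChatQuerySketch and the method BuildUrl are hypothetical names introduced only for illustration. Uri.EscapeDataString is swapped in for System.Web.HttpUtility.UrlEncode so that the sketch does not depend on System.Web. The host and path are copied from the listing.

```csharp
using System;

public static class WeChatQuerySketch
{
    // Host and path as they appear in the listing.
    private const string Host = "https://ali-weixin.showapi.com";
    private const string Path = "/582-9";

    // Builds the full request URL for one article.
    // articleUrl is the address of the WeChat article (hypothetical parameter name).
    public static string BuildUrl(string articleUrl, bool needContent = true)
    {
        if (string.IsNullOrWhiteSpace(articleUrl))
        {
            throw new ArgumentException("articleUrl must not be empty", nameof(articleUrl));
        }

        // Percent-encode the article URL so that its own ':' '/' '?' '&' characters
        // cannot be mistaken for parts of the outer query string.
        string query = "needContent=" + (needContent ? "1" : "0")
                     + "&url=" + Uri.EscapeDataString(articleUrl);

        return Host + Path + "?" + query;
    }
}
```

Calling BuildUrl with an article address returns a string that can be passed to WebRequest.Create. The AppCode still has to be added separately as the Authorization header.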

using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;
using System.Text;
using System.Threading.Tasks;

namespace QBSqlServer.GSDataAPIs.GetHtml
{
    public class WeChatPublicNumberQueryAPI
    {
        private const String host = "https://ali-weixin.showapi.com";
        private const String path = "/582-9";
        private const String method = "GET";
        private const String appcode = "your-appcode"; // replace with your own AppCode

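        // Calls the API for one article and returns the deserialized response.
        // articleUrl is the address of the article; it is sent as the "url" query parameter.
        public static Root GetWeChathtml(string articleUrl)
        {
            string encodedUrl = System.Web.HttpUtility.UrlEncode(articleUrl);
            // Variant that also sets the needComment flag:
            //String querys = "needComment=0&needContent=1&url=url";
            String querys = "needContent=1&url=" + encodedUrl;
            String bodys = "";
            String url = host + path;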
            HttpWebRequest httpRequest = null;
            HttpWebResponse httpResponse = null;

            if (0 < querys.Length)
            {
                url = url + "?" + querys;
            }

            if (host.Contains("https://"))
            {
                ServicePointManager.ServerCertificateValidationCallback = new RemoteCertificateValidationCallback(CheckValidationResult);
                httpRequest = (HttpWebRequest)WebRequest.CreateDefault(new Uri(url));
            }
            else
            {
                httpRequest = (HttpWebRequest)WebRequest.Create(url);
            }
            httpRequest.Method = method;
            httpRequest.Headers.Add("Authorization", "APPCODE " + appcode);
            if (0 < bodys.Length)
            {
                byte[] data = Encoding.UTF8.GetBytes(bodys);
                using (Stream stream = httpRequest.GetRequestStream())
                {
                    stream.Write(data, 0, data.Length);
                }
            }
            try
            {
                httpResponse = (HttpWebResponse)httpRequest.GetResponse();
            }
            catch (WebException ex)
            {
                // Error statuses still carry a response; connection-level failures do not.
                httpResponse = (HttpWebResponse)ex.Response;
                if (httpResponse == null)
                {
                    throw;
                }
            }

            using (httpResponse)
            using (Stream st = httpResponse.GetResponseStream())
            using (StreamReader reader = new StreamReader(st, Encoding.UTF8))
            {
                Console.WriteLine(httpResponse.StatusCode);
                Console.WriteLine(httpResponse.Method);
                Console.WriteLine(httpResponse.Headers);

                // Read the body once; a second ReadToEnd() would return an empty string.
                string strResult = reader.ReadToEnd();
                Console.WriteLine(strResult);
                Console.WriteLine("\n");
                return JsonConvert.DeserializeObject<Root>(strResult);
            }
        }
        }

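        // Accepts the server certificate only when the platform reports no policy errors.
        // Returning true unconditionally would disable TLS certificate checks for the whole process.
        public static bool CheckValidationResult(object sender, X509Certificate certificate, X509Chain chain, SslPolicyErrors errors)
        {
            return errors == SslPolicyErrors.None;
        }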
    }

    // Response payload. Property names match the JSON keys returned by the API.
    public class Showapi_res_body
    {
        public string newUrl { get; set; }
        public string date { get; set; }
        public string weixinNum { get; set; }
        /// <summary>
        /// HTML of the article body.
        /// </summary>
        public string content { get; set; }
        public string ret_code { get; set; }
        /// <summary>
        /// Example value: 秀场｜中国品牌ELLASSAY米兰时装周首秀！
        /// </summary>
        public string title { get; set; }
        public string contentImg { get; set; }
        public string userLogo { get; set; }
        public string oldUrl { get; set; }
        /// <summary>
        /// Example value: 徐峰立
        /// </summary>
        public string userName { get; set; }
        public string read_num { get; set; }
        public string like_num { get; set; }
        public string userLogo_code { get; set; }
    }

    // Top-level response envelope.
    public class Root
    {
        public string showapi_res_code { get; set; }
        public string showapi_res_error { get; set; }
        public Showapi_res_body showapi_res_body { get; set; }
    }
}
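
Once GetWeChathtml returns, the caller still has to decide whether the response is usable. A minimal sketch of that check follows. Result and Body are hypothetical stand-ins for Root and Showapi_res_body, trimmed to the fields read here so that the example compiles on its own. TryGetContent is likewise a name introduced for illustration. The sketch assumes that a showapi_res_code of "0" means success, which should be confirmed against the API's documentation.

```csharp
using System;

namespace ResultCheckSketch
{
    // Stand-in for Showapi_res_body, limited to the field this sketch reads.
    public class Body
    {
        public string content { get; set; }
    }

    // Stand-in for Root.
    public class Result
    {
        public string showapi_res_code { get; set; }
        public string showapi_res_error { get; set; }
        public Body showapi_res_body { get; set; }
    }

    public static class ArticleResult
    {
        // Returns true and sets html when the response looks successful;
        // otherwise returns false and sets error to a short description.
        // ASSUMPTION: showapi_res_code "0" signals success.
        public static bool TryGetContent(Result result, out string html, out string error)
        {
            html = null;
            error = null;

            if (result == null)
            {
                error = "empty or unparsable response";
                return false;
            }

            if (result.showapi_res_code != "0")
            {
                error = string.IsNullOrEmpty(result.showapi_res_error)
                    ? "API returned code " + result.showapi_res_code
                    : result.showapi_res_error;
                return false;
            }

            if (result.showapi_res_body == null
                || string.IsNullOrEmpty(result.showapi_res_body.content))
            {
                error = "response contained no article body";
                return false;
            }

            html = result.showapi_res_body.content;
            return true;
        }
    }
}
```

In an MVC action, the same check could decide between returning the HTML as the response content and returning an error status. It also keeps a null showapi_res_body from causing a NullReferenceException further down.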

Date: 2024-10-11 03:30:08
