Collecting Twitter Time-Series Data with Python

CODE:

#!/usr/bin/python
# -*- coding: utf-8 -*-

'''
Created on 2014-7-18
@author: guaguastd
@name: collect_time_series.py
'''

# Standard library
from functools import partial

# Local helper modules from this series
from trend import twitter_trends
from time_series import get_time_series_data

# For twitter_login, see http://blog.csdn.net/guaguastd/article/details/31706155
from login import twitter_login

if __name__ == '__main__':

    # Get an authenticated Twitter API handle
    twitter_api = twitter_login()

    # WOE ID 1 identifies worldwide trends
    WORLD_WOE_ID = 1

    # Bind the API handle and WOE ID so the collector can call this with no arguments
    twitter_world_trends = partial(twitter_trends, twitter_api, WORLD_WOE_ID)

    # Collect trend snapshots at intervals; the two strings name the storage target
    get_time_series_data(twitter_world_trends, 'time-series', 'twitter_world_trends')
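
The `time_series` module is not shown in this post, so the following is only a rough sketch of how a collector like `get_time_series_data` might work, not the author's implementation. It assumes the collector calls a zero-argument function once per interval and hands each result to a storage callback. The names `collect_time_series`, `fake_trends`, `interval_secs`, and `max_intervals` are hypothetical, and a plain list stands in for the database.

```python
import time
from functools import partial


def collect_time_series(fetch, store, interval_secs=60, max_intervals=15,
                        sleep=time.sleep):
    """Call fetch() once per interval and pass each result to store().

    Hypothetical sketch: the real get_time_series_data may differ.
    """
    written = 0
    for i in range(max_intervals):
        snapshot = fetch()          # zero-argument call, e.g. a partial
        store(snapshot)             # persist the snapshot somewhere
        written += 1
        if i < max_intervals - 1:   # no need to sleep after the last poll
            sleep(interval_secs)
    return written


def fake_trends(api, woe_id):
    # Hypothetical stand-in for twitter_trends(twitter_api, woe_id)
    return [{'locations': [{'woeid': woe_id}], 'trends': []}]


# An in-memory list replaces the database for this illustration
snapshots = []
world_trends = partial(fake_trends, None, 1)
count = collect_time_series(world_trends, snapshots.append,
                            interval_secs=0, max_intervals=3)
```

Because `partial` fixes the API handle and the WOE ID in advance, the collector never needs to know which arguments the underlying call takes.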

RESULT:

data:
[{u'locations': [{u'woeid': 1, u'name': u'Worldwide'}], u'created_at': u'2014-07-17T22:46:34Z', u'_id': ObjectId('53c852dcae6f221648bfdde9'), u'trends': [{u'url': u'http://twitter.com/search?q=%23MH17', u'query': u'%23MH17', u'name': u'#MH17', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23GazzeSiyonizmeMezarOlacak', u'query': u'%23GazzeSiyonizmeMezarOlacak', u'name': u'#GazzeSiyonizmeMezarOlacak', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23PrayForMH17', u'query': u'%23PrayForMH17', u'name': u'#PrayForMH17', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23LouisWeLoveYou', u'query': u'%23LouisWeLoveYou', u'name': u'#LouisWeLoveYou', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23SpamIsraelinTurkey', u'query': u'%23SpamIsraelinTurkey', u'name': u'#SpamIsraelinTurkey', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22MuhsinBa%C5%9Fkan+Erdo%C4%9Fan%C4%B1Desteklerdi%22', u'query': u'%22MuhsinBa%C5%9Fkan+Erdo%C4%9Fan%C4%B1Desteklerdi%22', u'name': u'MuhsinBa\u015fkan Erdo\u011fan\u0131Desteklerdi', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22SoyunKurusun+Katilisrail%22', u'query': u'%22SoyunKurusun+Katilisrail%22', u'name': u'SoyunKurusun Katilisrail', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22IsraelIsSlaughtering+TheworldIsWatching%22', u'query': u'%22IsraelIsSlaughtering+TheworldIsWatching%22', u'name': u'IsraelIsSlaughtering TheworldIsWatching', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22Elaine+Stritch%22', u'query': u'%22Elaine+Stritch%22', u'name': u'Elaine Stritch', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22%C5%9EimdiDuaVakti+GazzeYan%C4%B1yor%22', u'query': u'%22%C5%9EimdiDuaVakti+GazzeYan%C4%B1yor%22', u'name': u'\u015eimdiDuaVakti GazzeYan\u0131yor', u'promoted_content': None}], u'as_of': u'2014-07-17T22:49:01Z'}]

Write 1 trends
Zzz...
data:
[{u'locations': [{u'woeid': 1, u'name': u'Worldwide'}], u'created_at': u'2014-07-17T22:46:34Z', u'_id': ObjectId('53c8531eae6f221648bfddea'), u'trends': [{u'url': u'http://twitter.com/search?q=%23MH17', u'query': u'%23MH17', u'name': u'#MH17', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23GazzeSiyonizmeMezarOlacak', u'query': u'%23GazzeSiyonizmeMezarOlacak', u'name': u'#GazzeSiyonizmeMezarOlacak', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23PrayForMH17', u'query': u'%23PrayForMH17', u'name': u'#PrayForMH17', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23LouisWeLoveYou', u'query': u'%23LouisWeLoveYou', u'name': u'#LouisWeLoveYou', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23SpamIsraelinTurkey', u'query': u'%23SpamIsraelinTurkey', u'name': u'#SpamIsraelinTurkey', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22MuhsinBa%C5%9Fkan+Erdo%C4%9Fan%C4%B1Desteklerdi%22', u'query': u'%22MuhsinBa%C5%9Fkan+Erdo%C4%9Fan%C4%B1Desteklerdi%22', u'name': u'MuhsinBa\u015fkan Erdo\u011fan\u0131Desteklerdi', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22SoyunKurusun+Katilisrail%22', u'query': u'%22SoyunKurusun+Katilisrail%22', u'name': u'SoyunKurusun Katilisrail', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22IsraelIsSlaughtering+TheworldIsWatching%22', u'query': u'%22IsraelIsSlaughtering+TheworldIsWatching%22', u'name': u'IsraelIsSlaughtering TheworldIsWatching', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22Elaine+Stritch%22', u'query': u'%22Elaine+Stritch%22', u'name': u'Elaine Stritch', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22%C5%9EimdiDuaVakti+GazzeYan%C4%B1yor%22', u'query': u'%22%C5%9EimdiDuaVakti+GazzeYan%C4%B1yor%22', u'name': u'\u015eimdiDuaVakti GazzeYan\u0131yor', u'promoted_content': None}], u'as_of': u'2014-07-17T22:50:08Z'}]
Write 1 trends
Zzz...

data:
[{u'locations': [{u'woeid': 1, u'name': u'Worldwide'}], u'created_at': u'2014-07-17T22:46:34Z', u'_id': ObjectId('53c85361ae6f221648bfddeb'), u'trends': [{u'url': u'http://twitter.com/search?q=%23MH17', u'query': u'%23MH17', u'name': u'#MH17', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23GazzeSiyonizmeMezarOlacak', u'query': u'%23GazzeSiyonizmeMezarOlacak', u'name': u'#GazzeSiyonizmeMezarOlacak', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23PrayForMH17', u'query': u'%23PrayForMH17', u'name': u'#PrayForMH17', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23LouisWeLoveYou', u'query': u'%23LouisWeLoveYou', u'name': u'#LouisWeLoveYou', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%23SpamIsraelinTurkey', u'query': u'%23SpamIsraelinTurkey', u'name': u'#SpamIsraelinTurkey', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22MuhsinBa%C5%9Fkan+Erdo%C4%9Fan%C4%B1Desteklerdi%22', u'query': u'%22MuhsinBa%C5%9Fkan+Erdo%C4%9Fan%C4%B1Desteklerdi%22', u'name': u'MuhsinBa\u015fkan Erdo\u011fan\u0131Desteklerdi', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22SoyunKurusun+Katilisrail%22', u'query': u'%22SoyunKurusun+Katilisrail%22', u'name': u'SoyunKurusun Katilisrail', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22IsraelIsSlaughtering+TheworldIsWatching%22', u'query': u'%22IsraelIsSlaughtering+TheworldIsWatching%22', u'name': u'IsraelIsSlaughtering TheworldIsWatching', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22Elaine+Stritch%22', u'query': u'%22Elaine+Stritch%22', u'name': u'Elaine Stritch', u'promoted_content': None}, {u'url': u'http://twitter.com/search?q=%22%C5%9EimdiDuaVakti+GazzeYan%C4%B1yor%22', u'query': u'%22%C5%9EimdiDuaVakti+GazzeYan%C4%B1yor%22', u'name': u'\u015eimdiDuaVakti GazzeYan\u0131yor', u'promoted_content': None}], u'as_of': u'2014-07-17T22:51:15Z'}]

Write 1 trends
Zzz...
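
Each stored snapshot carries an `as_of` timestamp and a list of `trends`, so the snapshots can be turned into a simple time series of trend names. The sketch below assumes documents shaped like the output above. The function name `trend_names_by_time` is hypothetical, and the sample data is a hand-trimmed illustration that reuses field names and values from the output rather than a complete copy of it.

```python
def trend_names_by_time(snapshots):
    """Return (as_of, [trend names]) pairs ordered by timestamp."""
    series = []
    for doc in snapshots:
        names = [trend['name'] for trend in doc['trends']]
        series.append((doc['as_of'], names))
    # ISO 8601 UTC timestamps in the same format sort correctly as strings
    series.sort(key=lambda pair: pair[0])
    return series


# Trimmed, illustrative sample using the same keys as the stored documents
sample = [
    {'as_of': '2014-07-17T22:50:08Z',
     'trends': [{'name': '#MH17'}, {'name': 'Elaine Stritch'}]},
    {'as_of': '2014-07-17T22:49:01Z',
     'trends': [{'name': '#MH17'}, {'name': '#PrayForMH17'}]},
]

series = trend_names_by_time(sample)
for as_of, names in series:
    print(as_of, names)
```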

Time: 2024-11-29 08:28:41
