Python: Extracting entities from Twitter tweets (text, screen names, hashtags)

CODE:

#!/usr/bin/python
# -*- coding: utf-8 -*-

'''
Created on 2014-7-1
@author: guaguastd
@name: extract_tweet_entities.py
'''

if __name__ == '__main__':

    import json

    # import search
    from search import search_for_tweet

    # import login, see http://blog.csdn.net/guaguastd/article/details/31706155
    from login import twitter_login

    # get the twitter access api
    twitter_api = twitter_login()

    # import tweet
    from tweet import extract_tweet_entities

    while 1:
        query = raw_input('\nInput the query (eg. #MentionSomeoneImportantForYou, exit to quit): ')

        if query == 'exit':
            print 'Successfully exit!'
            break

        statuses = search_for_tweet(twitter_api, query)
        status_texts, screen_names, hashtags, words = extract_tweet_entities(statuses)

        # Explore the first 5 items for each...
        print json.dumps(status_texts[0:5], indent=1)
        print json.dumps(screen_names[0:5], indent=1)
        print json.dumps(hashtags[0:5], indent=1)
        print json.dumps(words[0:5], indent=1)
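
The script imports `extract_tweet_entities` from a `tweet` module that is not shown in this post. The following is a minimal sketch of what such a helper might look like, not the author's actual code. It assumes each status is a dict shaped like a Twitter REST API v1.1 status, with a `text` field and an `entities` field holding `user_mentions` (each with a `screen_name`) and `hashtags` (each with a `text`). The sample status in the demo is hand-built for illustration.

```python
def extract_tweet_entities(statuses):
    """Return (status_texts, screen_names, hashtags, words).

    Hypothetical helper: assumes each status is a dict shaped like a
    Twitter API v1.1 status object.
    """
    if not statuses:
        return [], [], [], []

    # The raw text of every tweet.
    status_texts = [status.get('text', '') for status in statuses]

    # Screen names come from the resolved user mentions, so an
    # "@name" that Twitter did not resolve to a user is not included.
    screen_names = [mention['screen_name']
                    for status in statuses
                    for mention in status.get('entities', {}).get('user_mentions', [])]

    # Hashtag text is stored without the leading '#'.
    hashtags = [tag['text']
                for status in statuses
                for tag in status.get('entities', {}).get('hashtags', [])]

    # Whitespace-delimited tokens across all tweets, punctuation kept.
    words = [word for text in status_texts for word in text.split()]

    return status_texts, screen_names, hashtags, words


if __name__ == '__main__':
    import json

    # Hand-built sample status, for illustration only.
    sample = [{
        'text': 'RT @alice: #Example hello',
        'entities': {
            'user_mentions': [{'screen_name': 'alice'}],
            'hashtags': [{'text': 'Example'}],
        },
    }]
    for items in extract_tweet_entities(sample):
        print(json.dumps(items[0:5], indent=1))
```

Reading mentions and hashtags from `entities` rather than parsing the text would explain why the word list in the result below keeps tokens such as `@xmlovex:` while the screen-name list contains only the bare names.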

RESULT:

Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): #MentionSomeoneImportantForYou
Length of statuses 30
[
 "RT @xmlovex: #MentionSomeoneImportantForYou @purpledrauhl_23",
 "RT @KillahPimpp: #MentionSomeoneImportantForYou @MissRosaa_",
 "#MentionSomeoneImportantForYou @justinbieber",
 "\"@KillahPimpp: #MentionSomeoneImportantForYou @_K_L_O_\"",
 "RT @KillahPimpp: #MentionSomeoneImportantForYou @_K_L_O_"
]
[
 "xmlovex",
 "KillahPimpp",
 "MissRosaa_",
 "justinbieber",
 "KillahPimpp"
]
[
 "MentionSomeoneImportantForYou",
 "MentionSomeoneImportantForYou",
 "MentionSomeoneImportantForYou",
 "MentionSomeoneImportantForYou",
 "MentionSomeoneImportantForYou"
]
[
 "RT",
 "@xmlovex:",
 "#MentionSomeoneImportantForYou",
 "@purpledrauhl_23",
 "RT"
]

Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): 
Time: 2024-10-06 07:22:45
