Python: Frequency Analysis of Twitter Tweet Elements (Word, Screen Name, Hashtag)

#!/usr/bin/python
# -*- coding: utf-8 -*-

'''
Created on 2014-7-2
@author: guaguastd
@name: tweet_frequency_analysis.py
'''

if __name__ == '__main__':

    # import Counter
    from collections import Counter

    # pip install prettytable
    from prettytable import PrettyTable

    # import login, see http://blog.csdn.net/guaguastd/article/details/31706155
    from login import oauth_login

    # get the twitter access api
    twitter_api = oauth_login()

    # import tweet, see http://blog.csdn.net/guaguastd/article/details/36163301
    from tweets import tweet

    while True:
        query = raw_input('\nInput the query (eg. #MentionSomeoneImportantForYou, exit to quit): ')

        if query == 'exit':
            print 'Successfully exit!'
            break

        # tweet() returns the status texts plus the extracted screen names, hashtags and words
        status_texts, screen_names, hashtags, words = tweet(twitter_api, query)

        # print one table per element type, listing the 10 most frequent items
        for label, data in (('Word', words),
                            ('Screen Name', screen_names),
                            ('Hashtag', hashtags)):
            pt = PrettyTable(field_names=[label, 'Count'])
            # most_common(10) already limits the result to the top 10 (item, count) pairs
            for row in Counter(data).most_common(10):
                pt.add_row(row)
            pt.align[label], pt.align['Count'] = 'l', 'r'
            print pt

Result:

Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): Hello world
Length of statuses 99
'next_results'
+-------+-------+
| Word  | Count |
+-------+-------+
| the   |    99 |
| hello |    52 |
| is    |    50 |
| in    |    50 |
| me    |    46 |
| best  |    46 |
| you   |    46 |
| world |    44 |
| it    |    42 |
| tweet |    40 |
+-------+-------+
+--------------+-------+
| Screen Name  | Count |
+--------------+-------+
| Harry_Styles |    39 |
| justinbieber |     6 |
| shots        |     6 |
| john         |     6 |
| WHATCHAKNO   |     4 |
| hatahata88   |     2 |
| Michael5SOS  |     2 |
| Oprah_World  |     1 |
| kuga_aimu    |     1 |
| chriscobbins |     1 |
+--------------+-------+
+--------------+-------+
| Hashtag      | Count |
+--------------+-------+
| MoneyAnthem  |     4 |
| MILLIONBUCKS |     4 |
| New          |     4 |
| MUSTHEAR     |     4 |
| WorldCup2014 |     2 |
| gousa        |     1 |
| Lukaku       |     1 |
| USA          |     1 |
| BEL          |     1 |
| MGWV         |     1 |
+--------------+-------+

Input the query (eg. #MentionSomeoneImportantForYou, exit to quit):
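
The script above relies on tweet() from a separate post, so its body is not shown here. As a rough sketch only, the four lists could be built along the following lines, assuming each status is a dict in the Twitter Search API v1.1 JSON layout ('text', plus 'entities' holding 'user_mentions' and 'hashtags'). The function name extract_elements and the sample status are hypothetical, and the sketch is written for Python 3, unlike the Python 2 script above.

```python
def extract_elements(statuses):
    """Split a list of status dicts into texts, screen names, hashtags and words.

    Assumes the Twitter Search API v1.1 layout; this is an illustration,
    not the author's tweet() implementation.
    """
    status_texts = [status['text'] for status in statuses]
    screen_names = [mention['screen_name']
                    for status in statuses
                    for mention in status['entities']['user_mentions']]
    hashtags = [tag['text']
                for status in statuses
                for tag in status['entities']['hashtags']]
    # Naive whitespace tokenization; punctuation stays attached to words
    words = [word for text in status_texts for word in text.split()]
    return status_texts, screen_names, hashtags, words


# Hypothetical sample status, standing in for a real search result
sample_statuses = [{
    'text': 'Hello world from @john #USA',
    'entities': {
        'user_mentions': [{'screen_name': 'john'}],
        'hashtags': [{'text': 'USA'}],
    },
}]

texts, names, tags, words = extract_elements(sample_statuses)
print(names, tags, words)
```

Note that with whitespace tokenization the mentions and hashtags also show up in the word list (as '@john' and '#USA'), which may or may not match what the original tweet() does.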

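The counting step can be tried without Twitter credentials. The following is a minimal sketch in Python 3 that runs Counter.most_common() on a hard-coded word list and renders the result with plain string formatting instead of PrettyTable, so it has no third-party dependency. The helper names top_items and render_table and the sample data are hypothetical, not part of the original script.

```python
from collections import Counter


def top_items(items, n=10):
    """Return the n most frequent items as (item, count) pairs."""
    return Counter(items).most_common(n)


def render_table(label, rows):
    """Render (item, count) pairs as text: items left-aligned, counts right-aligned."""
    item_width = max([len(label)] + [len(str(item)) for item, _ in rows])
    count_width = max([len('Count')] + [len(str(count)) for _, count in rows])
    border = '+-{}-+-{}-+'.format('-' * item_width, '-' * count_width)
    row_format = '| {:<{iw}} | {:>{cw}} |'
    lines = [border,
             row_format.format(label, 'Count', iw=item_width, cw=count_width),
             border]
    for item, count in rows:
        lines.append(row_format.format(item, count, iw=item_width, cw=count_width))
    lines.append(border)
    return '\n'.join(lines)


# Hypothetical sample data standing in for the words list returned by tweet()
sample_words = 'hello world hello twitter the world hello'.split()
rows = top_items(sample_words, n=2)
print(render_table('Word', rows))
```

Passing n to most_common() avoids building the full sorted list and then slicing it, which is the same effect the script gets from most_common(10).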
