Sentiment analysis (very ish est less)

import jieba
import numpy as np

# Open a dictionary file and return its entries as a list.
def open_dict(Dict='mini', path=r'/Users/apple888/PycharmProjects/Textming/Sent_Dict/Hownet/'):
    path = path + '%s.txt' % Dict
    words = []
    # The context manager closes the file even if reading fails.
    with open(path, 'r', encoding='utf-8') as dictionary:
        for word in dictionary:
            words.append(word.strip('\n'))
    return words

def judgeodd(num):
    if (num % 2) == 0:
        return 'even'
    else:
        return 'odd'

# Note: change the path below to the folder that holds your dictionary files.
deny_word = open_dict(Dict='否定词', path=r'C:/Users/Administrator/Desktop/Textming/')
posdict = open_dict(Dict='positive', path=r'C:/Users/Administrator/Desktop/Textming/')
negdict = open_dict(Dict='negative', path=r'C:/Users/Administrator/Desktop/Textming/')

degree_word = open_dict(Dict='程度级别词语', path=r'C:/Users/Administrator/Desktop/Textming/')
# Weight 4: the score of the following sentiment word is multiplied by 4.
mostdict = degree_word[degree_word.index('extreme') + 1 : degree_word.index('very')]
# Weight 3
verydict = degree_word[degree_word.index('very') + 1 : degree_word.index('more')]
# Weight 2
moredict = degree_word[degree_word.index('more') + 1 : degree_word.index('ish')]
# Weight 0.5
ishdict = degree_word[degree_word.index('ish') + 1 : degree_word.index('last')]
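The slicing above relies on the degree-word file containing the marker lines 'extreme', 'very', 'more', 'ish' and 'last', with each group of words listed between two markers. A minimal sketch with an in-memory list shows how the groups are extracted. The sample words, `degree_word_demo`, `slice_between` and `degree_weights` are placeholders made up for illustration and are not the contents of the real dictionary file:

```python
# Hypothetical stand-in for the degree-word file: marker lines separate the groups.
degree_word_demo = [
    'extreme', '极其', '最',
    'very', '非常', '很',
    'more', '更', '比较',
    'ish', '稍微', '有点',
    'last',
]


def slice_between(words, start_marker, end_marker):
    """Return the entries strictly between two marker lines."""
    return words[words.index(start_marker) + 1 : words.index(end_marker)]


# Weights follow the comments in the script: 4, 3, 2 and 0.5.
degree_weights = {}
for start, end, weight in [('extreme', 'very', 4.0), ('very', 'more', 3.0),
                           ('more', 'ish', 2.0), ('ish', 'last', 0.5)]:
    for w in slice_between(degree_word_demo, start, end):
        degree_weights[w] = weight

print(degree_weights)
```

Collecting the groups into one word-to-weight mapping is only one possible design; it replaces four list membership tests with a single dictionary lookup.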

def sentiment_score_list(dataset):
    # Split the text into sentences on the Chinese full stop.
    seg_sentence = dataset.split('。')

    count1 = []
    count2 = []
    for sen in seg_sentence:  # iterate over every sentence of the review
        segtmp = jieba.lcut(sen, cut_all=False)  # tokenize the sentence; returns a list of words
        i = 0  # position of the word currently being scanned
        a = 0  # position just after the previous sentiment word
        poscount = 0  # initial score of the current positive word
        sinsitive_count1 = 0
        sinsitive_count2 = 0
        poscount2 = 0  # score of the positive word after negation
        poscount3 = 0  # final positive score (including the exclamation-mark bonus)
        negcount = 0
        negcount2 = 0
        negcount3 = 0
        for word in segtmp:
            if word in posdict:  # the word is a positive sentiment word
                poscount += 1
                sinsitive_count1 += 1
                c = 0
                for w in segtmp[a:i]:  # scan the degree words before the sentiment word
                    if w in mostdict:
                        poscount *= 4.0
                    elif w in verydict:
                        poscount *= 3.0
                    elif w in moredict:
                        poscount *= 2.0
                    elif w in ishdict:
                        poscount *= 0.5
                    elif w in deny_word:
                        c += 1
                if judgeodd(c) == 'odd':  # an odd number of negation words flips the sign
                    poscount *= -1.0
                    poscount2 += poscount
                    poscount = 0
                    poscount3 = poscount + poscount2 + poscount3
                    poscount2 = 0
                else:
                    poscount3 = poscount + poscount2 + poscount3
                    poscount = 0
                a = i + 1  # move the start of the scan window past this sentiment word

            elif word in negdict:  # negative sentiment, handled the same way as above
                negcount += 1
                sinsitive_count2 += 1
                d = 0
                for w in segtmp[a:i]:
                    if w in mostdict:
                        negcount *= 4.0
                    elif w in verydict:
                        negcount *= 3.0
                    elif w in moredict:
                        negcount *= 2.0
                    elif w in ishdict:
                        negcount *= 0.5
                    elif w in deny_word:  # count negation words, as in the positive branch
                        d += 1
                if judgeodd(d) == 'odd':
                    negcount *= -1.0
                    negcount2 += negcount
                    negcount = 0
                    negcount3 = negcount + negcount2 + negcount3
                    negcount2 = 0
                else:
                    negcount3 = negcount + negcount2 + negcount3
                    negcount = 0
                a = i + 1
            elif word in ('!', '!'):  # full-width or ASCII exclamation mark
                # Scan backwards for a sentiment word; if one is found, add 2 to both scores and stop.
                for w2 in segtmp[::-1]:
                    if w2 in posdict or w2 in negdict:
                        poscount3 += 2
                        negcount3 += 2
                        sinsitive_count1 += 1
                        sinsitive_count2 += 1
                        break
            i += 1  # advance the scan position

            # The following keeps both scores non-negative.
            pos_count = 0
            neg_count = 0
            if poscount3 < 0 and negcount3 > 0:
                neg_count = negcount3 - poscount3
                pos_count = 0
            elif negcount3 < 0 and poscount3 > 0:
                pos_count = poscount3 - negcount3
                neg_count = 0
            elif poscount3 < 0 and negcount3 < 0:
                neg_count = -poscount3
                pos_count = -negcount3
            else:
                pos_count = poscount3
                neg_count = negcount3

            # Record the running [positive, negative] score after each word.
            count1.append([pos_count, neg_count])
        count2.append(count1)
        count1 = []

    return count2

def sentiment_score(senti_score_list):
    score = []
    for review in senti_score_list:
        score_array = np.array(review)
        print(score_array)
        Pos = np.sum(score_array[:, 0])
        Neg = np.sum(score_array[:, 1])
        AvgPos = np.mean(score_array[:, 0])
        AvgPos = float('%.1f' % AvgPos)
        AvgNeg = np.mean(score_array[:, 1])
        AvgNeg = float('%.1f' % AvgNeg)
        StdPos = np.std(score_array[:, 0])
        StdPos = float('%.1f' % StdPos)
        StdNeg = np.std(score_array[:, 1])
        StdNeg = float('%.1f' % StdNeg)
        score.append([Pos, Neg, AvgPos, AvgNeg, StdPos, StdNeg])

    return score

data = '你就是坑人的,什么玩意!你们的手机真不好用!非常生气,我非常郁闷!!!!'
data2 = '我好开心啊,非常非常非常高兴!今天我得了一百分,我很兴奋开心,愉快,开心'
print(sentiment_score(sentiment_score_list(data)))
print(sentiment_score(sentiment_score_list(data2)))
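The rule applied to each sentiment word can be tried without jieba or the dictionary files. The sketch below is an illustration, not the script's own function: `score_sentiment_word`, `degree_weights` and `deny_words` are hypothetical names, and the sample words and weights are assumptions. It starts from a base score of 1, multiplies by the weight of every degree word found between the previous sentiment word and the current one, and flips the sign when the number of negation words in that window is odd:

```python
def score_sentiment_word(preceding_tokens, degree_weights, deny_words, base=1.0):
    """Score one sentiment word from the tokens that precede it.

    preceding_tokens: tokens between the previous sentiment word and this one.
    degree_weights:   hypothetical mapping from degree word to multiplier.
    deny_words:       hypothetical set of negation words.
    """
    score = base
    negations = 0
    for tok in preceding_tokens:
        if tok in degree_weights:
            score *= degree_weights[tok]
        elif tok in deny_words:
            negations += 1
    # An odd number of negation words reverses the polarity.
    if negations % 2 == 1:
        score = -score
    return score


# Assumed sample entries; the real weights come from the degree-word file.
degree_weights = {'非常': 3.0, '有点': 0.5}
deny_words = {'不'}

print(score_sentiment_word(['非常'], degree_weights, deny_words))        # 3.0
print(score_sentiment_word(['不'], degree_weights, deny_words))          # -1.0
print(score_sentiment_word(['不', '非常'], degree_weights, deny_words))   # -3.0
```

A negated score is not discarded: the script later adds the magnitude of a negative positive-score to the negative total (and vice versa), so that both reported scores stay non-negative.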

Original article: https://www.cnblogs.com/rabbittail/p/8336291.html

Time: 2024-10-11 05:14:05
