Counting Word Frequencies in a File with Python and Generating a Word Cloud

wordcloud

1 How to Generate a Word Cloud with Python

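from wordcloud import WordCloud
import matplotlib.pyplot as plt
import jieba

# Note: 'word.txt' no longer exists at this path; point it at your own text file.
path_txt = "/home/alan/Desktop/word.txt"

# Read the whole article; the context manager closes the file afterwards.
with open(path_txt, "r", encoding="UTF-8") as f:
    text = f.read()

# Segment the Chinese text with jieba and join the words with spaces
# so that WordCloud can tell the individual words apart.
cut_text = " ".join(jieba.cut(text))

# font_path must point to a font that contains Chinese glyphs.
wordcloud = WordCloud(
    font_path="/home/alan/.local/share/fonts/STKAITI.TTF",
    background_color="white",
    width=1000,
    height=800,
).generate(cut_text)

# Draw the word cloud without axes.
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()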

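Overall approach:

Load the article
Segment it into words with "jieba"
Count word frequencies (WordCloud.generate does this internally)
Generate and draw the word cloud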
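
The script never counts word frequencies explicitly, so the counts themselves are not visible anywhere. If you want to inspect them, the counting step could be sketched roughly as follows. This is only an illustration using the standard library: the function name `count_words` and its parameters `min_length` and `top_n` are hypothetical and not part of the original post, and the sample tokens are made up to stand in for the output of `jieba.cut`.

```python
from collections import Counter


def count_words(tokens, min_length=2, top_n=None):
    """Count how often each token occurs.

    tokens: any iterable of strings, e.g. the output of jieba.cut(text).
    min_length: hypothetical filter that drops tokens shorter than this
                (whitespace, punctuation, single characters).
    top_n: if given, keep only the top_n most frequent tokens.
    """
    cleaned = (t.strip() for t in tokens)
    counts = Counter(t for t in cleaned if len(t) >= min_length)
    # most_common(None) returns every token, sorted by descending count.
    return dict(counts.most_common(top_n))


if __name__ == "__main__":
    # Made-up tokens standing in for a segmented article.
    sample = ["词云", "的", "词频", " ", "词云", "，", "统计", "词云", "词频"]
    for word, freq in count_words(sample).items():
        print(word, freq)
```

In the script above, the tokens would come from `jieba.cut(text)`, and the resulting dictionary could be passed to `WordCloud.generate_from_frequencies` in place of calling `generate` on the joined text. Whether a minimum length of 2 suits your text is a judgment call, since it also discards meaningful single-character words.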

Original source: https://www.cnblogs.com/alango/p/10364436.html

Time: 2024-10-07 20:39:25
