Loading a custom corpus:
from nltk.corpus import PlaintextCorpusReader
corpus_root = '/tmp'  # path to the directory that holds the corpus files
wordlists = PlaintextCorpusReader(corpus_root, '.*')  # the pattern can also be a single file name such as 'a.txt'
wordlists.fileids()
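Once the reader is built, the same object can be used to read the corpus contents. The following is a minimal sketch, not part of the original snippet: it assumes NLTK is installed, writes a throwaway file named a.txt into a temporary directory instead of /tmp, and uses a hypothetical helper load_corpus with a '.*\.txt' pattern chosen only for illustration.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader


def load_corpus(corpus_root, pattern=r".*\.txt"):
    """Hypothetical helper: build a reader over files in corpus_root matching pattern."""
    return PlaintextCorpusReader(corpus_root, pattern)


# Assumption: a temporary directory stands in for the real corpus directory.
with tempfile.TemporaryDirectory() as corpus_root:
    sample_path = os.path.join(corpus_root, "a.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("Hello corpus world.")

    wordlists = load_corpus(corpus_root)

    # File names (relative to corpus_root) that matched the pattern.
    file_ids = wordlists.fileids()

    # Tokens of one file, using the reader's default word tokenizer.
    words = list(wordlists.words("a.txt"))

    # The unprocessed text of the same file.
    raw_text = wordlists.raw("a.txt")

print(file_ids)
print(words)
print(raw_text)
```

Only words() and raw() are used here because sents() relies on a sentence tokenizer model that may need a separate download.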
Time: 2024-08-27 04:37:05