Natural Language 19.1: Lemmatizing with NLTK

https://www.pythonprogramming.net/lemmatizing-nltk-tutorial/?completed=/named-entity-recognition-nltk-tutorial/

Lemmatizing with NLTK

Lemmatizing is an operation very similar to stemming. The major
difference, as you saw earlier, is that stemming can often create
non-existent words, whereas lemmas are actual words.

So your root stem, meaning the word you end up with, is not
necessarily something you can look up in a dictionary, but you can
look up a lemma.

Sometimes you will wind up with a very similar word, but sometimes
you will wind up with a completely different word. Let's see some
examples.

from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

print(lemmatizer.lemmatize("cats"))
print(lemmatizer.lemmatize("cacti"))
print(lemmatizer.lemmatize("geese"))
print(lemmatizer.lemmatize("rocks"))
print(lemmatizer.lemmatize("python"))
print(lemmatizer.lemmatize("better", pos="a"))
print(lemmatizer.lemmatize("best", pos="a"))
print(lemmatizer.lemmatize("run"))
print(lemmatizer.lemmatize("run", "v"))

Here, we've got a bunch of examples of the lemma for the words that we use. The only major thing to note is that lemmatize takes a part-of-speech parameter, pos. If it is not supplied, the default is "n" (noun). This means every word is looked up as a noun, so a verb or adjective may come back unchanged, which can create trouble for you. Keep this in mind if you use lemmatizing!

In the next tutorial, we're going to dive into the NLTK corpus that came with the module, looking at all of the awesome documents they have waiting for us there.

Time: 2024-10-29 10:45:51
