Using the new elasticsearch-dsl (7.1) from Python

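I. The old elasticsearch-dsl (5.1), which matches Elasticsearch 5.1.1

Many students following chapter 10 of the Python search engine video course, on using Elasticsearch, saw the instructor create the mapping from Python with the code below. The way this code references elasticsearch-dsl no longer works with the latest release, and running it raises an exception.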

from datetime import datetime
from elasticsearch_dsl import Document, Date, Nested, Boolean, analyzer, InnerDoc, Completion, Keyword, Text, Integer

from elasticsearch_dsl.analysis import CustomAnalyzer as _CustomAnalyzer

from elasticsearch_dsl.connections import connections
connections.create_connection(hosts=["localhost"])

# class CustomAnalyzer(_CustomAnalyzer):
#     def get_analysis_definition(self):
#         return {}

# ik_analyzer = CustomAnalyzer("ik_max_word", filter=["lowercase"])
class ArticleType(Document):
    # Jobbole (伯乐在线) article type
    # suggest = Completion(analyzer=ik_analyzer)
    title = Text(analyzer="ik_max_word")
    create_date = Date()
    url = Keyword()
    url_object_id = Keyword()
    front_image_url = Keyword()
    front_image_path = Keyword()
    praise_nums = Integer()
    comment_nums = Integer()
    fav_nums = Integer()
    tags = Text(analyzer="ik_max_word")
    content = Text(analyzer="ik_max_word")

    class Meta:
        index = "jobbole"
        doc_type = "article"

if __name__ == "__main__":
    ArticleType.init()

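II. Corrected imports and code for the new version

1. Download location of the latest elasticsearch-dsl:
the elasticsearch-dsl GitHub repository
2. Code that builds the jobbole mapping with the latest version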

# -*- coding: utf-8 -*-
__author__ = 'yh'
from datetime import datetime
from elasticsearch_dsl import Document, Date, Integer, Keyword, Text, connections

# Define a default Elasticsearch client
connections.create_connection(hosts=['localhost'])

class ArticleType(Document):
    # Jobbole (伯乐在线) article type
    # suggest = Completion(analyzer=ik_analyzer)
    title = Text(analyzer="ik_max_word")
    create_date = Date()
    url = Keyword()
    url_object_id = Keyword()
    front_image_url = Keyword()
    front_image_path = Keyword()
    praise_nums = Integer()
    comment_nums = Integer()
    fav_nums = Integer()
    tags = Text(analyzer="ik_max_word")
    content = Text(analyzer="ik_max_word")

    class Index:
        name = 'jobbole'
        settings = {
          "number_of_shards": 5,
        }
# create the mappings in elasticsearch
if __name__ == "__main__":
    ArticleType.init()
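
For reference, the mapping that `ArticleType.init()` asks Elasticsearch to create should correspond roughly to the request body sketched below. This is a hand-written approximation in plain Python, assuming Elasticsearch 7.x (mappings without a document type) and an installed IK analysis plugin. `build_index_body` is a hypothetical helper for illustration, not part of elasticsearch-dsl.

```python
import json

# Field names grouped by Elasticsearch field type, mirroring the ArticleType class.
TEXT_FIELDS = ["title", "tags", "content"]
KEYWORD_FIELDS = ["url", "url_object_id", "front_image_url", "front_image_path"]
INTEGER_FIELDS = ["praise_nums", "comment_nums", "fav_nums"]


def build_index_body(number_of_shards=5):
    """Hypothetical helper: approximate create-index body for the jobbole index."""
    properties = {"create_date": {"type": "date"}}
    for name in TEXT_FIELDS:
        # Text fields are analyzed with the IK plugin's ik_max_word analyzer.
        properties[name] = {"type": "text", "analyzer": "ik_max_word"}
    for name in KEYWORD_FIELDS:
        properties[name] = {"type": "keyword"}
    for name in INTEGER_FIELDS:
        properties[name] = {"type": "integer"}
    return {
        "settings": {"number_of_shards": number_of_shards},
        "mappings": {"properties": properties},
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2, sort_keys=True))
```

Comparing this with the response of `GET jobbole/_mapping` after running the script is a quick way to check that the index was created as expected.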

Heads up

Notes on the rest of the elasticsearch-dsl usage

With the new elasticsearch-dsl, the top of the file is written like this:

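from ArticleSpider.models.es_types import ArticleType
from elasticsearch_dsl.connections import connections
# Connect to Elasticsearch; this client is used to generate search suggestions.
# The first positional argument of create_connection is the connection alias,
# so the host is passed by keyword instead of passing ArticleType.
es = connections.create_connection(hosts=["localhost"])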

With the new elasticsearch-dsl, the lower part of the file is written like this:

def gen_suggests(index, info_tuple):
    # Build the list of search suggestions from (text, weight) pairs
    used_words = set()
    suggests = []
    for text, weight in info_tuple:
        if text:
            # Call the Elasticsearch analyze API to tokenize the string
            words = es.indices.analyze(index=index,
                                       body={"analyzer": "ik_max_word", "text": str(text)})
            analyzed_words = set([r["token"] for r in words["tokens"] if len(r["token"]) > 1])
            new_words = analyzed_words - used_words
            # Remember these words so lower-weight fields do not repeat them
            used_words.update(new_words)
        else:
            new_words = set()

        if new_words:
            suggests.append({"input": list(new_words), "weight": weight})

    return suggests

Then call it like this:

article.suggest = gen_suggests("jobbole", ((article.title, 10), (article.tags, 7)))

article.save()
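
The weighting and de-duplication in `gen_suggests` can be tried without a running cluster. The sketch below is an offline variant that replaces the `es.indices.analyze` call with a stand-in, `fake_analyze`, which only splits on whitespace and lowercases (real IK tokenization behaves very differently). Both `fake_analyze` and `gen_suggests_offline` are hypothetical names. The sketch records every emitted word in `used_words`, so a word already suggested by a higher-weight field is not repeated.

```python
def fake_analyze(text):
    """Stand-in for es.indices.analyze: whitespace split, lowercased (assumption)."""
    return {"tokens": [{"token": word.lower()} for word in text.split()]}


def gen_suggests_offline(info_tuple, analyze=fake_analyze):
    """Offline variant of gen_suggests with the analyzer injected as a function."""
    used_words = set()
    suggests = []
    for text, weight in info_tuple:
        if text:
            tokens = analyze(text)["tokens"]
            # Keep only tokens longer than one character.
            analyzed_words = {t["token"] for t in tokens if len(t["token"]) > 1}
            new_words = analyzed_words - used_words
            # Record the words so later, lower-weight fields skip them.
            used_words.update(new_words)
        else:
            new_words = set()
        if new_words:
            # Sorted only to make the output deterministic.
            suggests.append({"input": sorted(new_words), "weight": weight})
    return suggests


if __name__ == "__main__":
    fields = (("Python search engine", 10), ("python elasticsearch", 7))
    print(gen_suggests_offline(fields))
```

In this run "python" appears only in the weight-10 entry, because the title is processed first and the tags field contributes only the words not seen yet.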

Original article: https://www.cnblogs.com/yoyowin/p/12208365.html

Date: 2024-11-09 00:35:00
