[elasticsearch] Using it from Python

Useful links:

Most useful: http://es.xiaoleilu.com/054_Query_DSL/70_Important_clauses.html

A good blog post: http://www.cnblogs.com/letong/p/4749234.html

Other 1: http://www.jianshu.com/p/14aa8b09c789

1. Query everything in an index

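# coding=utf8
from elasticsearch import Elasticsearch

# Replace x.x.x.x with the address of your Elasticsearch node
es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"
query = {"query": {"match_all": {}}}  # match every document
resp = es.search(index=index, body=query)
resp_docs = resp["hits"]["hits"]  # only the first page (10 hits by default)
total = resp['hits']['total']

print(total)  # total number of matching documents
print(resp_docs[0]['_source']['@timestamp'])  # print one field of the first hit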
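
The shape of resp['hits']['total'] depends on the server version: older clusters return a plain integer, while Elasticsearch 7 and later return an object such as {"value": 42, "relation": "eq"}. Below is a minimal sketch of a helper that accepts both shapes. The name total_hits is hypothetical and not part of the elasticsearch package, and the sample responses are hand-written rather than real server output.

```python
def total_hits(resp):
    """Return the hit count of a search response as an int.

    Accepts both the plain integer returned by older Elasticsearch
    versions and the {"value": N, "relation": ...} object of 7.x and later.
    """
    total = resp["hits"]["total"]
    if isinstance(total, dict):
        return total["value"]
    return total


# Hand-written sample responses (assumption: not captured from a real cluster)
old_style = {"hits": {"total": 3, "hits": []}}
new_style = {"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}

print(total_hits(old_style))
print(total_hits(new_style))
```

With a live client this would be called as total_hits(resp) right after es.search(...).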

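2. Fetch everything in batches with scroll, plus a compound filter

Filter: field A is not empty, field B is not empty, and the timestamp lies between 10 days ago and 2 days ago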
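
The script in this section uses the "filtered" query, which belongs to Elasticsearch 1.x/2.x; to my knowledge it was deprecated in 2.0 and removed in 5.0. For newer clusters, the same conditions can probably be expressed with a plain bool query and a filter clause. The sketch below only builds the request body; build_query and its parameters are hypothetical names chosen for illustration.

```python
def build_query(fields, gte, lt, time_field="@timestamp"):
    """Build a bool query body.

    Every field in `fields` must not equal the empty string, and
    `time_field` must fall in the half-open range [gte, lt).
    """
    return {
        "query": {
            "bool": {
                # One must_not clause per field, collected in a list
                "must_not": [{"term": {name: ""}} for name in fields],
                # filter clauses restrict results without affecting scoring
                "filter": [{"range": {time_field: {"gte": gte, "lt": lt}}}],
            }
        }
    }


query = build_query(["A", "B"], "now-10d", "now-2d")
print(query["query"]["bool"]["must_not"])
```

The resulting dict can be passed as the body of es.search in place of the "filtered" version.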

# coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"
query = {
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    # Use a list here: repeating the "must_not" key in a dict
                    # literal silently keeps only the last entry.
                    "must_not": [
                        {"term": {"A": ""}},
                        {"term": {"B": ""}},
                    ]
                }
            },
            "filter": {
                "range": {'@timestamp': {'gte': 'now-10d', 'lt': 'now-2d'}}
            }
        }
    }
}
# The first request opens the scroll context and returns the first batch
resp = es.search(index=index, body=query, scroll="1m", size=100)
resp_docs = resp["hits"]["hits"]
total = resp['hits']['total']
datas = list(resp_docs)  # copy, so later batches are appended to a separate list
while len(resp_docs) > 0:
    scroll_id = resp['_scroll_id']  # the scroll id may change between requests
    resp = es.scroll(scroll_id=scroll_id, scroll="1m")
    resp_docs = resp["hits"]["hits"]
    datas.extend(resp_docs)
    if len(datas) >= total:  # everything has been fetched
        break

print(len(datas))
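
The scroll loop above can also be wrapped in a generator, so callers simply iterate over hits without tracking counts. This is a sketch under assumptions: iter_scroll is a hypothetical helper, and FakeClient is a stand-in that serves canned pages so the example runs without a cluster. It only mimics the search and scroll calls used in this post.

```python
def iter_scroll(es, index, query, scroll="1m", size=100):
    """Yield every hit for `query`, fetching `size` documents per request."""
    resp = es.search(index=index, body=query, scroll=scroll, size=size)
    while resp["hits"]["hits"]:
        for hit in resp["hits"]["hits"]:
            yield hit
        # Always pass on the most recent scroll id
        resp = es.scroll(scroll_id=resp["_scroll_id"], scroll=scroll)


class FakeClient(object):
    """Stand-in for an Elasticsearch client (demonstration only)."""

    def __init__(self, pages):
        self.pages = list(pages)

    def _next_page(self):
        # Serve the next canned page, or an empty page when exhausted
        hits = self.pages.pop(0) if self.pages else []
        return {"_scroll_id": "dummy", "hits": {"hits": hits}}

    def search(self, **kwargs):
        return self._next_page()

    def scroll(self, **kwargs):
        return self._next_page()


client = FakeClient([[{"_id": 1}, {"_id": 2}], [{"_id": 3}]])
docs = list(iter_scroll(client, "test", {"query": {"match_all": {}}}))
print(len(docs))
```

With a real cluster, pass the es object instead of FakeClient. The elasticsearch package also ships elasticsearch.helpers.scan, which serves the same purpose and may be preferable to a hand-written loop.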

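3. Aggregation

List the distinct values of the @timestamp field. Note that a terms aggregation returns only the 10 largest buckets unless size is set.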

# coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"
# terms aggregation: one bucket per distinct @timestamp value
query = {"aggs": {"all_times": {"terms": {"field": "@timestamp"}}}}
resp = es.search(index=index, body=query)
total = resp['hits']['total']
print(total)
print(resp["aggregations"])
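
The aggregation result is nested under the name given in the query ("all_times" here), with one entry per bucket. A minimal sketch of pulling out the value and document count of each bucket follows. bucket_counts is a hypothetical helper, and the sample response is hand-written to match the usual shape of a terms aggregation result.

```python
def bucket_counts(resp, agg_name):
    """Return [(key, doc_count), ...] for a terms aggregation in `resp`."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    # Date fields also carry a readable "key_as_string"; prefer it when present
    return [(b.get("key_as_string", b["key"]), b["doc_count"]) for b in buckets]


# Hand-written sample (assumption: not captured from a real cluster)
sample = {
    "aggregations": {
        "all_times": {
            "buckets": [
                {"key": 1728000000000,
                 "key_as_string": "2024-10-04T00:00:00.000Z",
                 "doc_count": 5},
                {"key": 1728086400000, "doc_count": 2},
            ]
        }
    }
}

for key, count in bucket_counts(sample, "all_times"):
    print(key, count)
```

The length of the returned list gives the number of distinct values, up to the bucket limit of the aggregation.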

Time: 2024-10-11 04:47:05
