Grouped aggregation queries (GROUP BY) in Elasticsearch from Java

This post shows how to connect to Elasticsearch from Java and run aggregation queries against it.

I. Grouping by a single field and counting

1. Table structure: (image not preserved)

Group by task id and count how many text titles each task id has.

2. Equivalent SQL: select taskid, count(*) as sum from task group by taskid;
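
To make the intended result concrete, here is a minimal plain-Java sketch of what this GROUP BY computes, run on an in-memory list instead of Elasticsearch. The class name GroupCountSketch, the method countByTaskId, and the sample rows are made up for illustration and are not part of the original post.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupCountSketch {
    // Counts how many rows share each taskid, like
    // SELECT taskid, COUNT(*) FROM task GROUP BY taskid.
    // Each row is {taskid, title}; only the taskid column drives the grouping.
    static Map<String, Long> countByTaskId(List<String[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (String[] row) -> row[0], TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical sample data: two titles under task 1, one under task 2
        List<String[]> rows = List.of(
                new String[] {"1", "title a"},
                new String[] {"1", "title b"},
                new String[] {"2", "title c"});
        System.out.println(countByTaskId(rows)); // prints {1=2, 2=1}
    }
}
```

The terms aggregation used later in this post plays the same role on the server side: each bucket key corresponds to a map key here, and each bucket's document count corresponds to the map value.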

Java ES connection utility class

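// Imports for the Elasticsearch 5.x transport client API used in this class
import java.net.InetAddress;

import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest;
import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse;
import org.elasticsearch.client.IndicesAdminClient;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class ESClientConnectionUtil {
    // Shared client instance; volatile so a client published by one thread is visible to the others
    public static volatile TransportClient client = null;
    public final static String HOST = "192.168.200.211"; // Elasticsearch server address
    public final static Integer PORT = 9301; // transport port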

    // Returns the shared client, creating it on first use; the second null check
    // inside the lock keeps concurrent callers from creating it twice
    public static TransportClient getESClient() {
        System.setProperty("es.set.netty.runtime.available.processors", "false");
        if (client == null) {
            synchronized (ESClientConnectionUtil.class) {
                if (client == null) {
                    try {
                        // Set the cluster name and enable sniffing of the other cluster nodes
                        Settings settings = Settings.builder().put("cluster.name", "es5").put("client.transport.sniff", true).build();
                        // Create the client
                        client = new PreBuiltTransportClient(settings).addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(HOST), PORT));
                    } catch (Exception ex) {
                        // The stack trace already includes the exception message
                        ex.printStackTrace();
                    }
                }
            }
        }
        return client;
    }
    // Unsynchronized variant of getESClient(); use it only where a single thread initializes the client
    public static TransportClient getESClientConnection() {
        if (client == null) {
            System.setProperty("es.set.netty.runtime.available.processors", "false");
            try {
                // Set the cluster name and enable sniffing of the other cluster nodes
                Settings settings = Settings.builder().put("cluster.name", "es5").put("client.transport.sniff", true).build();
                // Create the client
                client = new PreBuiltTransportClient(settings).addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(HOST), PORT));
            } catch (Exception ex) {
                // The stack trace already includes the exception message
                ex.printStackTrace();
            }
        }
        return client;
    }

    // Checks whether the given index exists
    public static boolean judgeIndex(String index) {
        client = getESClientConnection();
        // Ask the indices admin client whether the index exists
        IndicesAdminClient adminClient = client.admin().indices();
        IndicesExistsRequest request = new IndicesExistsRequest(index);
        IndicesExistsResponse response = adminClient.exists(request).actionGet();
        return response.isExists();
    }
}
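
The utility class creates its client lazily and shares a single instance. As an alternative to explicit locking, the initialization-on-demand holder idiom gets the same effect from the JVM's class-loading guarantees. The sketch below is only an illustration under stated assumptions: FakeClient is a hypothetical stand-in for an expensive client such as TransportClient, and none of these names appear in the original post.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyClientHolderSketch {
    // Counts constructions so the example can show the client is built only once
    static final AtomicInteger CREATED = new AtomicInteger();

    // Hypothetical stand-in for an expensive client such as TransportClient
    static final class FakeClient {
        FakeClient() {
            CREATED.incrementAndGet();
        }
    }

    // The JVM initializes Holder lazily and exactly once, on the first call to getClient()
    private static final class Holder {
        static final FakeClient INSTANCE = new FakeClient();
    }

    static FakeClient getClient() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) throws InterruptedException {
        // Several threads race to obtain the client
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(LazyClientHolderSketch::getClient);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("clients created: " + CREATED.get()); // prints clients created: 1
    }
}
```

The trade-off is that the holder idiom cannot retry after a failed construction, whereas the null-check approach in the utility class tries again on the next call.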

Java ES query (group by a single column and count)

// Group by task id and count the documents in each group
// client: the TransportClient obtained from ESClientConnectionUtil
SearchRequestBuilder sbuilder = client.prepareSearch("hottopic").setTypes("hot");
// Terms aggregation on taskid; the aggregation (the "column alias") is named "sum"
TermsAggregationBuilder termsBuilder = AggregationBuilders.terms("sum").field("taskid");
sbuilder.addAggregation(termsBuilder);
SearchResponse responses = sbuilder.execute().actionGet();
// Fetch the buckets of the "sum" aggregation
Terms terms = responses.getAggregations().get("sum");
List<BsKnowledgeInfoDTO> lists = new ArrayList<>(); // the author's DTO list; not filled in this snippet
for (Terms.Bucket bucket : terms.getBuckets()) {
    String id = bucket.getKey().toString(); // task id
    long sum = bucket.getDocCount(); // number of documents in this group
    // Print the count and the id of each group
    System.out.println("==" + sum + "------" + id);
}
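
One thing to keep in mind when reading these buckets: a terms aggregation returns only the top buckets by document count, 10 by default, unless a larger size is requested on the builder with size(...). The plain-Java sketch below imitates that truncation on an in-memory list so the effect is easy to see. TopBucketsSketch, topBuckets, and the sample ids are hypothetical names and data, not taken from the post.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TopBucketsSketch {
    // Returns the `size` most frequent ids with their counts, highest count first;
    // ties are broken by key so the result is deterministic.
    static Map<String, Long> topBuckets(List<String> taskIds, int size) {
        Map<String, Long> counts = taskIds.stream()
                .collect(Collectors.groupingBy((String id) -> id, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Three distinct ids, but only the two most frequent are kept
        List<String> taskIds = List.of("7", "7", "7", "3", "3", "9");
        System.out.println(topBuckets(taskIds, 2)); // prints {7=3, 3=2}
    }
}
```

If every group is needed, the size has to be raised to at least the number of distinct values of the field.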

Grouping by multiple columns and counting

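// Group by task id, then by birthplace, and count the documents in each group
// client: the TransportClient obtained from ESClientConnectionUtil
SearchRequestBuilder sbuilder = client.prepareSearch("hottopic").setTypes("hot");
// First level: terms aggregation on taskid, named "sum"
TermsAggregationBuilder termsBuilder = AggregationBuilders.terms("sum").field("taskid");
// Second level: terms aggregation on birthplace, named "region_count";
// a third field would be nested in the same way, and so on
TermsAggregationBuilder aAggregationBuilder2 = AggregationBuilders.terms("region_count").field("birthplace");
sbuilder.addAggregation(termsBuilder.subAggregation(aAggregationBuilder2));
SearchResponse responses = sbuilder.execute().actionGet();
// Fetch the buckets of the outer "sum" aggregation
Terms terms = responses.getAggregations().get("sum");
List<BsKnowledgeInfoDTO> lists = new ArrayList<>(); // the author's DTO list; not filled in this snippet
for (Terms.Bucket bucket : terms.getBuckets()) {
    String id = bucket.getKey().toString(); // task id
    long sum = bucket.getDocCount(); // number of documents for this task id
    // Print the count and the id of each task group
    System.out.println("==" + sum + "------" + id);
    // Each task bucket carries its own birthplace buckets
    Terms regions = bucket.getAggregations().get("region_count");
    for (Terms.Bucket region : regions.getBuckets()) {
        System.out.println("    " + region.getKey() + ": " + region.getDocCount());
    }
}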
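
The nesting of one terms aggregation inside another corresponds to a two-level GROUP BY. As a rough plain-Java analogy (not Elasticsearch code), the sketch below groups in-memory rows first by taskid and then by birthplace. NestedGroupSketch, countByTaskAndBirthplace, and the sample rows are invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class NestedGroupSketch {
    // Each row is {taskid, birthplace}. The outer map plays the role of the
    // taskid buckets, the inner map the role of the birthplace sub-buckets.
    static Map<String, Map<String, Long>> countByTaskAndBirthplace(List<String[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (String[] row) -> row[0], TreeMap::new,
                Collectors.groupingBy(
                        (String[] row) -> row[1], TreeMap::new, Collectors.counting())));
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        List<String[]> rows = List.of(
                new String[] {"1", "Beijing"},
                new String[] {"1", "Beijing"},
                new String[] {"1", "Shanghai"},
                new String[] {"2", "Beijing"});
        // prints {1={Beijing=2, Shanghai=1}, 2={Beijing=1}}
        System.out.println(countByTaskAndBirthplace(rows));
    }
}
```

Reading the Elasticsearch response follows the same shape: iterate the outer buckets, then fetch the inner aggregation from each bucket by its name.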

Computing max/min/sum/avg over multiple fields

// client: the TransportClient obtained from ESClientConnectionUtil
SearchRequestBuilder requestBuilder = client.prepareSearch("hottopic").setTypes("hot");
// Group by taskid (aggregation named "sum") and sort the buckets ascending by the
// "avg_title" sub-aggregation. The order path must name a sub-aggregation, not a field.
// Order here is Terms.Order from the Elasticsearch 5.x API.
TermsAggregationBuilder aggregationBuilder1 = AggregationBuilders.terms("sum").field("taskid")
        .order(Order.aggregation("avg_title", true));
// Average of tasktitleid within each bucket, named "avg_title"
AggregationBuilder aggregationBuilder2 = AggregationBuilders.avg("avg_title").field("tasktitleid");
// Sum of taskid within each bucket, named "sum_taskid"
AggregationBuilder aggregationBuilder3 = AggregationBuilders.sum("sum_taskid").field("taskid");
requestBuilder.addAggregation(aggregationBuilder1.subAggregation(aggregationBuilder2).subAggregation(aggregationBuilder3));
SearchResponse response = requestBuilder.execute().actionGet();

Terms aggregation = response.getAggregations().get("sum");
for (Terms.Bucket bucket : aggregation.getBuckets()) {
    Avg avgTitle = bucket.getAggregations().get("avg_title"); // org.elasticsearch.search.aggregations.metrics.avg.InternalAvg
    Sum sumTaskId = bucket.getAggregations().get("sum_taskid"); // org.elasticsearch.search.aggregations.metrics.sum.InternalSum
    System.out.println("id=" + bucket.getKey() + ";avg=" + avgTitle.getValue() + ";sum=" + sumTaskId.getValue());
}
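
For comparison, the same per-group metrics can be sketched in plain Java on in-memory data. This is only an analogy to the query above, not Elasticsearch code: GroupMetricsSketch, statsByTask, and the sample rows are hypothetical. DoubleSummaryStatistics also exposes min and max, which covers the other metrics named in the heading.

```java
import java.util.Comparator;
import java.util.DoubleSummaryStatistics;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupMetricsSketch {
    // Each row is {taskid, tasktitleid}. Returns per-task statistics of tasktitleid,
    // with the groups sorted ascending by their average.
    static Map<Long, DoubleSummaryStatistics> statsByTask(List<long[]> rows) {
        Map<Long, DoubleSummaryStatistics> stats = rows.stream().collect(
                Collectors.groupingBy((long[] row) -> row[0],
                        Collectors.summarizingDouble((long[] row) -> row[1])));
        return stats.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<Long, DoubleSummaryStatistics> e) -> e.getValue().getAverage()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Hypothetical sample data: task 1 has titles 10 and 20, task 2 has title 5
        List<long[]> rows = List.of(new long[] {1, 10}, new long[] {1, 20}, new long[] {2, 5});
        statsByTask(rows).forEach((taskId, s) -> System.out.println(
                "id=" + taskId + ";avg=" + s.getAverage() + ";sum=" + s.getSum()
                        + ";min=" + s.getMin() + ";max=" + s.getMax()));
    }
}
```

With this sample data, task 2 is listed before task 1 because its average is lower, which mirrors the ascending order on the average sub-aggregation.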

If anything above is inaccurate, please bear with me and leave your comments. Technology thrives on communication!

Original article: https://www.cnblogs.com/chenyuanbo/p/9973311.html

Time: 2024-11-05 20:26:51
