Implementing the single-table SQL query "select a, b, sum(), avg(), max() from ... group by a, b order by a, b limit M offset N" with JDK 8 streams, and its performance

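The reason for testing this scenario: when merging results from multiple data sources, sometimes only a single subquery result is left to process, and handing that to a SQL database is not necessarily reasonable (the network latency is too high).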

Test data: 100,000 rows; result: 1,000 rows

Latency for limit 20 offset 0:

package com.hundsun.ta.base.service;

import com.hundsun.ta.utils.JsonUtils;
import lombok.AllArgsConstructor;
import lombok.NoArgsConstructor;

import java.math.BigDecimal;
import java.util.*;
import java.util.stream.Collectors;

import static java.util.stream.Collectors.*;

/**
 * @author zjhua
 * @description
 * @date 2019/10/3 15:35
 */
public class JavaStreamCommonSQLTest {
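    public static void main(String[] args) {
        // Build 100,000 rows. name and age are both derived from i modulo 1000,
        // so grouping by (name, age) yields 1,000 groups.
        List<Person> persons = new ArrayList<>();
        for (int i = 100000; i > 0; i--) {
            persons.add(new Person("Person " + (i + 1) % 1000, i % 100, i % 1000, new BigDecimal(i), i));
        }
        System.out.println(System.currentTimeMillis()); // start timestamp
        // group by name, age -> avg(quantity), sum(quantity), via a two-level map
        Map<String, Map<Integer, Data>> result = persons.stream().collect(
                groupingBy(Person::getName, Collectors.groupingBy(Person::getAge,
                        collectingAndThen(summarizingDouble(Person::getQuantity),
                                dss -> new Data((long) dss.getAverage(), (long) dss.getSum())))));
        // Flatten the nested map into one row per (name, age) group
        List<ResultGroup> list = new ArrayList<>();
        result.forEach((k, v) -> v.forEach((ik, iv) -> list.add(new ResultGroup(k, ik, iv.average, iv.sum))));
        // Note: rows are ordered by the aggregates (sum, then average), not by the group keys
        list.sort(Comparator.comparing(ResultGroup::getSum).thenComparing(ResultGroup::getAverage));
        // limit 20 offset 0: keep the page instead of discarding the subList result
        List<ResultGroup> page = list.subList(0, 20);
        System.out.println(System.currentTimeMillis()); // end timestamp
        System.out.println(JsonUtils.toJson(page));
    }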
}

@lombok.Data
@NoArgsConstructor
@AllArgsConstructor
class Person {
    String name;
    int group;
    int age;
    BigDecimal balance;
    double quantity;
}

@lombok.Data
@NoArgsConstructor
@AllArgsConstructor
@Deprecated
class ResultGroup {
    String name;
    int group;
    long average;
    long sum;
}
class Data {
    long average;
    long sum;

    public Data(long average, long sum) {
        this.average = average;
        this.sum = sum;
    }

}

Start: 1570093479002
End: 1570093479235  -- a bit over 200 ms
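
The start and end timestamps above come from a single run measured with System.currentTimeMillis(), so they include JIT warm-up. As a rough sketch only, a repeated measurement could look like the following. TimingSketch, medianMillis and sink are hypothetical names that are not part of the original test, and the grouping in main is a stand-in workload rather than the Person query. No timing results are claimed for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class TimingSketch {
    // Results are stored here to make it harder for the JIT to drop the measured work as dead code.
    static volatile Object sink;

    /**
     * Runs the task `warmups` times untimed, then `runs` times timed (runs must be > 0),
     * and returns the middle elapsed time in milliseconds (upper median for even run counts).
     */
    static double medianMillis(Supplier<?> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink = task.get();
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink = task.get();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[runs / 2] / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in data set: 100,000 integers grouped into 1,000 buckets.
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            data.add(i);
        }
        double ms = medianMillis(
                () -> data.stream().collect(Collectors.groupingBy(
                        (Integer i) -> i % 1000,
                        Collectors.summarizingInt((Integer i) -> i))),
                5, 11);
        System.out.println("median ms: " + ms);
    }
}
```

For numbers that need to be trusted, a dedicated harness such as JMH is probably a better fit than a hand-rolled loop like this.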

Test data: 100,000 rows; result: about 90,000 rows

Latency for limit 20 offset 10000:

package com.hundsun.ta.base.service;

import com.hundsun.ta.utils.JsonUtils;
import lombok.AllArgsConstructor;
import lombok.NoArgsConstructor;

import java.math.BigDecimal;
import java.util.*;
import java.util.stream.Collectors;

import static java.util.stream.Collectors.*;

/**
 * @author zjhua
 * @description
 * @date 2019/10/3 15:35
 */
public class JavaStreamCommonSQLTest {
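    public static void main(String[] args) {
        // Build 100,000 rows. For i <= 90000 the group value is unique,
        // so grouping by (name, group) yields roughly 90,000 groups.
        List<Person> persons = new ArrayList<>();
        for (int i = 100000; i > 0; i--) {
            persons.add(new Person("Person " + (i + 1) % 1000, i > 90000 ? i % 10000 : i, i % 1000, new BigDecimal(i), i));
        }
        System.out.println(System.currentTimeMillis()); // start timestamp
        // group by name, group -> avg(quantity), sum(quantity), via a two-level map
        Map<String, Map<Integer, Data>> result = persons.stream().collect(
                groupingBy(Person::getName, Collectors.groupingBy(Person::getGroup,
                        collectingAndThen(summarizingDouble(Person::getQuantity),
                                dss -> new Data((long) dss.getAverage(), (long) dss.getSum())))));
        // Flatten the nested map into one row per (name, group) pair
        List<ResultGroup> list = new ArrayList<>();
        result.forEach((k, v) -> v.forEach((ik, iv) -> list.add(new ResultGroup(k, ik, iv.average, iv.sum))));
        // Note: rows are ordered by the aggregates (sum, then average), not by the group keys
        list.sort(Comparator.comparing(ResultGroup::getSum).thenComparing(ResultGroup::getAverage));
        System.out.println(list.size());
        // limit 20 offset 10000: keep the page instead of discarding the subList result
        List<ResultGroup> page = list.subList(10000, 10020);
        System.out.println(System.currentTimeMillis()); // end timestamp
        System.out.println(JsonUtils.toJson(page));
    }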
}

@lombok.Data
@NoArgsConstructor
@AllArgsConstructor
class Person {
    String name;
    int group;
    int age;
    BigDecimal balance;
    double quantity;
}

@lombok.Data
@NoArgsConstructor
@AllArgsConstructor
@Deprecated
class ResultGroup {
    String name;
    int group;
    long average;
    long sum;
}
class Data {
    long average;
    long sum;

    public Data(long average, long sum) {
        this.average = average;
        this.sum = sum;
    }

}

Start: 1570093823404

End: 1570093823758  -- a bit over 350 ms

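Overall, as things stand, Java streams still cannot replace SQL directly at low cost. For example, the typical group by over multiple fields is not supported directly and needs multi-level maps (which is not only complex but also slow), and the aggregate results of the group by have to live in a separate class. The development cost is simply too high.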
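
One commonly suggested way to avoid the nested maps is to group on a composite key. The following is only a sketch under stated assumptions: CompositeKeyGroupBy, Row, Agg and query are hypothetical names that do not appear in the test above, and a two-element List stands in for the (a, b) key because List compares element by element in equals and hashCode. It still needs a separate result class, and it has not been benchmarked here.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CompositeKeyGroupBy {

    /** Hypothetical input row: two grouping columns (a, b) and one measure. */
    static final class Row {
        final String a;
        final int b;
        final double quantity;

        Row(String a, int b, double quantity) {
            this.a = a;
            this.b = b;
            this.quantity = quantity;
        }
    }

    /** Hypothetical result row: the group keys plus sum, avg and max of quantity. */
    static final class Agg {
        final String a;
        final int b;
        final double sum;
        final double avg;
        final double max;

        Agg(String a, int b, DoubleSummaryStatistics s) {
            this.a = a;
            this.b = b;
            this.sum = s.getSum();
            this.avg = s.getAverage();
            this.max = s.getMax();
        }
    }

    /** select a, b, sum, avg, max ... group by a, b order by a, b limit `limit` offset `offset`. */
    static List<Agg> query(List<Row> rows, int limit, int offset) {
        // group by a, b: a single map keyed by the pair [a, b]
        Map<List<Object>, DoubleSummaryStatistics> grouped = rows.stream()
                .collect(Collectors.groupingBy(
                        (Row r) -> Arrays.<Object>asList(r.a, r.b),
                        Collectors.summarizingDouble((Row r) -> r.quantity)));

        return grouped.entrySet().stream()
                .map(e -> new Agg((String) e.getKey().get(0), (Integer) e.getKey().get(1), e.getValue()))
                // order by a, b
                .sorted(Comparator.comparing((Agg g) -> g.a).thenComparingInt(g -> g.b))
                // offset N, limit M
                .skip(offset)
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("x", 1, 10), new Row("x", 1, 20),
                new Row("x", 2, 5), new Row("y", 1, 7));
        for (Agg g : query(rows, 2, 1)) {
            System.out.println(g.a + "," + g.b + " sum=" + g.sum + " avg=" + g.avg + " max=" + g.max);
        }
    }
}
```

The untyped List key trades type safety for brevity; a small key class with equals and hashCode would avoid the casts at the price of one more class.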

Reference: https://stackoverflow.com/questions/32071726/java-8-stream-groupingby-with-multiple-collectors

Original article: https://www.cnblogs.com/zhjh256/p/11619840.html

Time: 2024-10-31 09:48:39
