Repost: PostgreSQL Partitioning and Optimization

-- For partitioned tables, the constraint_exclusion parameter must be set to partition or on
postgres=# show constraint_exclusion ;
 constraint_exclusion
----------------------
 partition
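
-- A minimal sketch of changing the setting, assuming a session that is allowed
-- to alter it. SET affects only the current session; a permanent change belongs
-- in postgresql.conf.
set constraint_exclusion = partition;
show constraint_exclusion;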

-- Create the parent and child tables that will hold the partitioned data
create table t(id int primary key);
create table t1(like t including all) inherits(t);
create table t2(like t including all) inherits(t);
create table t3(like t including all) inherits(t);
create table t4(like t including all) inherits(t);
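-- In PostgreSQL the constraints on sibling child tables are independent of one another, so their ranges may overlap; they are not global constraints.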
 alter table t1 add constraint ck_t1_1 check(id<0);
 alter table t2 add constraint ck_t2_1 check(id>=0 and id<100);
 alter table t3 add constraint ck_t3_1 check(id>=100 and id<200);
 alter table t4 add constraint ck_t4_1 check(id>=200);
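
-- A small illustration (not part of the original session) of what these
-- constraints do and do not guarantee. Each CHECK is enforced only on its own
-- child, and the primary key copied by LIKE ... INCLUDING ALL is a separate
-- index per table, so uniqueness is not enforced across the hierarchy.
-- Everything runs inside a transaction that is rolled back at the end.
begin;
insert into t2 values (10);    -- accepted: 10 satisfies ck_t2_1
insert into t  values (10);    -- also accepted: the parent has its own primary key index
select tableoid::regclass, id from t where id = 10;   -- returns one row from t and one from t2
insert into t2 values (150);   -- rejected: 150 violates ck_t2_1
rollback;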

-- When the partition column is compared with a constant, only the parent and the child whose constraint matches are scanned:
postgres=#  explain select * from t where id=10;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Append  (cost=0.00..8.17 rows=2 width=4)
   ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=4)
         Filter: (id = 10)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 10)
(5 rows)

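-- A prepared-statement parameter behaves the same way here: the plan shown was built with the bound value, so only the parent and the matching child are scanned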
postgres=#  prepare p_test as select * from t where id=$1;
PREPARE
postgres=# explain execute p_test(1);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Append  (cost=0.00..8.17 rows=2 width=4)
   ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=4)
         Filter: (id = 1)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 1)
(5 rows)
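
-- A hedged sketch of the case where the parameter value is NOT available to the
-- planner. It assumes PostgreSQL 12 or later, where plan_cache_mode exists; the
-- plans in this article come from an older release. With a generic plan the
-- planner sees only $1, so constraint exclusion cannot rule out any child.
set plan_cache_mode = force_generic_plan;
explain execute p_test(1);
reset plan_cache_mode;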

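-- With a subquery, the parent and all children are scanned. The subquery here reads only child t1, so ideally only the parent and t1 would be scanned, but its result ($0) is not known at plan time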
postgres=#  explain select * from t where id=(select id from t1 limit 1);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Append  (cost=0.01..32.70 rows=5 width=4)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..0.01 rows=1 width=4)
           ->  Seq Scan on t1 t1_1  (cost=0.00..34.00 rows=2400 width=4)
   ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=4)
         Filter: (id = $0)
   ->  Index Only Scan using t1_pkey on t1  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = $0)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = $0)
   ->  Index Only Scan using t3_pkey on t3  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = $0)
   ->  Index Only Scan using t4_pkey on t4  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = $0)
(14 rows)

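-- In summary: when querying a partitioned table, prefer literal constants in the partition-column condition over subqueries and other complex SQL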
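
-- One way to follow this advice from psql, sketched under the assumption that
-- t1 holds at least one row: fetch the value first with \gset (available in
-- psql 9.3 and later), then pass it to the main query as a literal. The psql
-- variable is named id after the column returned by the first query.
select id from t1 limit 1 \gset
explain select * from t where id = :id;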

-- If a child's constraint is dropped, PostgreSQL has to include that child in the plan, even when it could otherwise be skipped
alter table t4 drop constraint ck_t4_1;
postgres=#  explain select * from t where id=10;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Append  (cost=0.00..16.34 rows=3 width=4)
   ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=4)
         Filter: (id = 10)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 10)
   ->  Index Only Scan using t4_pkey on t4  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 10)
(7 rows)
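
-- To let the planner skip t4 again, put the constraint back. Adding a CHECK
-- constraint validates the existing rows, so this fails if t4 already holds a
-- row with id < 200.
alter table t4 add constraint ck_t4_1 check(id>=200);
explain select * from t where id=10;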

-- If constraint_exclusion is set to off, PostgreSQL has to scan every child table
postgres=# set constraint_exclusion=off;
SET
postgres=#  explain select * from t where id=10;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Append  (cost=0.00..32.69 rows=5 width=4)
   ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=4)
         Filter: (id = 10)
   ->  Index Only Scan using t1_pkey on t1  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 10)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 10)
   ->  Index Only Scan using t3_pkey on t3  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 10)
   ->  Index Only Scan using t4_pkey on t4  (cost=0.15..8.17 rows=1 width=4)
         Index Cond: (id = 10)
(11 rows)
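
-- The SET above lasts for the rest of the session. A short sketch of returning
-- to the configured value before continuing:
reset constraint_exclusion;
show constraint_exclusion;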

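-- Indexes on a partitioned table are normally created per partition
-- An index created on the parent covers only the rows stored in the parent itself; it is not a global index. A query against a child that lacks a suitable index of its own therefore falls back to a sequential scan of that child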

-- Indexes created on the parent table
postgres=# \d+ p
                                       Table "public.p"
  Column   |              Type              | Modifiers | Storage | Stats target | Description
-----------+--------------------------------+-----------+---------+--------------+-------------
 city_id   | integer                        | not null  | plain   |              |
 logtime   | timestamp(0) without time zone | not null  | plain   |              |
 peaktemp  | integer                        |           | plain   |              |
 unitsales | integer                        |           | plain   |              |
Indexes:
    "idx_city_id" btree (city_id)
    "idx_p_logtime" btree (logtime)
Triggers:
    delete_p_trigger BEFORE DELETE ON p FOR EACH ROW EXECUTE PROCEDURE p_delete_trigger()
    insert_p_trigger BEFORE INSERT ON p FOR EACH ROW EXECUTE PROCEDURE p_insert_trigger()
Child tables: p_201201,
              p_201202,
              p_201203,
              p_201204,
              p_201205,
              p_201206,
              p_201207,
              p_201208,
              p_201209,
              p_201210,
              p_201211,
              p_201212,
              p_default
Has OIDs: no

-- The partition has no index on city_id of its own, and it cannot use the parent's index
postgres=# explain select * from p_201202 where city_id=2 and logtime=timestamp '2012-02-02 12:59:59';
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Seq Scan on p_201202  (cost=0.00..214.01 rows=2 width=20)
   Filter: ((city_id = 2) AND (logtime = '2012-02-02 12:59:59'::timestamp without time zone))
(2 rows)

-- After an index is created on the partition, the query can use it
postgres=# CREATE INDEX idx_p_201202_city_id ON p_201202 (city_id);
CREATE INDEX
postgres=# explain select * from p_201202 where city_id=2 and logtime=timestamp '2012-02-02 12:59:59';
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Index Scan using idx_p_201202_city_id on p_201202  (cost=0.29..8.33 rows=2 width=20)
   Index Cond: (city_id = 2)
   Filter: (logtime = '2012-02-02 12:59:59'::timestamp without time zone)
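
-- A sketch of creating the same index on every child of p instead of one at a
-- time. It assumes PostgreSQL 9.5 or later for CREATE INDEX IF NOT EXISTS, and
-- the index naming pattern idx_<child>_city_id simply follows the example
-- above. Children are looked up in the pg_inherits catalog.
do $$
declare
    r record;
begin
    for r in
        select c.oid::regclass as tbl, c.relname
        from pg_inherits i
        join pg_class c on c.oid = i.inhrelid
        where i.inhparent = 'p'::regclass
    loop
        -- %I quotes the index name as an identifier; %s inserts the
        -- already-quoted regclass name of the child table
        execute format('create index if not exists %I on %s (city_id)',
                       'idx_' || r.relname || '_city_id', r.tbl);
    end loop;
end;
$$;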

-- You can also query only the rows stored in the parent itself

postgres=# select * from only p;
 city_id | logtime | peaktemp | unitsales
---------+---------+----------+-----------
(0 rows)
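
-- To see where the rows actually live, the system column tableoid can be
-- grouped on. This is an illustrative query, not part of the original session.
select tableoid::regclass as stored_in, count(*)
from p
group by tableoid
order by 1;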

-- Once a child no longer inherits from the parent, queries on the parent no longer include that child
postgres=# alter table t3 no inherit t;
ALTER TABLE
postgres=# explain select count(*) from t;
                            QUERY PLAN
------------------------------------------------------------------
 Aggregate  (cost=73.50..73.51 rows=1 width=0)
   ->  Append  (cost=0.00..62.80 rows=4281 width=0)
         ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=0)
         ->  Seq Scan on t1  (cost=0.00..31.40 rows=2140 width=0)
         ->  Seq Scan on t2  (cost=0.00..31.40 rows=2140 width=0)
(5 rows)

-- After inheritance is added back, queries on the parent include the child again
postgres=# alter table t3 inherit t;
ALTER TABLE
postgres=# explain select count(*) from t;
                            QUERY PLAN
------------------------------------------------------------------
 Aggregate  (cost=110.25..110.26 rows=1 width=0)
   ->  Append  (cost=0.00..94.20 rows=6421 width=0)
         ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=0)
         ->  Seq Scan on t1  (cost=0.00..31.40 rows=2140 width=0)
         ->  Seq Scan on t2  (cost=0.00..31.40 rows=2140 width=0)
         ->  Seq Scan on t3  (cost=0.00..31.40 rows=2140 width=0)
(6 rows)
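
-- The current set of children can also be read from the pg_inherits catalog,
-- which is a quick way to confirm the effect of NO INHERIT / INHERIT without
-- reading a plan. An illustrative query:
select inhrelid::regclass as child
from pg_inherits
where inhparent = 't'::regclass
order by 1;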

-- The code below creates table p and its test data
CREATE TABLE p (
    city_id         int not null,
    logtime         timestamp(0) not null,
    peaktemp        int,
    unitsales       int
);

CREATE INDEX idx_p_logtime ON p (logtime);

CREATE TABLE p_201201 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201202 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201203 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201204 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201205 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201206 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201207 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201208 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201209 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201210 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201211 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_201212 (LIKE p INCLUDING all) INHERITS (p);
CREATE TABLE p_default (LIKE p INCLUDING all) INHERITS (p);

CREATE OR REPLACE FUNCTION p_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    -- Route each new row to the child that covers its month
    IF    ( NEW.logtime >= DATE '2012-01-01' AND NEW.logtime < DATE '2012-02-01' ) THEN
        INSERT INTO p_201201 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-02-01' AND NEW.logtime < DATE '2012-03-01' ) THEN
        INSERT INTO p_201202 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-03-01' AND NEW.logtime < DATE '2012-04-01' ) THEN
        INSERT INTO p_201203 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-04-01' AND NEW.logtime < DATE '2012-05-01' ) THEN
        INSERT INTO p_201204 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-05-01' AND NEW.logtime < DATE '2012-06-01' ) THEN
        INSERT INTO p_201205 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-06-01' AND NEW.logtime < DATE '2012-07-01' ) THEN
        INSERT INTO p_201206 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-07-01' AND NEW.logtime < DATE '2012-08-01' ) THEN
        INSERT INTO p_201207 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-08-01' AND NEW.logtime < DATE '2012-09-01' ) THEN
        INSERT INTO p_201208 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-09-01' AND NEW.logtime < DATE '2012-10-01' ) THEN
        INSERT INTO p_201209 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-10-01' AND NEW.logtime < DATE '2012-11-01' ) THEN
        INSERT INTO p_201210 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-11-01' AND NEW.logtime < DATE '2012-12-01' ) THEN
        INSERT INTO p_201211 VALUES (NEW.*);
    ELSIF ( NEW.logtime >= DATE '2012-12-01' AND NEW.logtime < DATE '2013-01-01' ) THEN
        INSERT INTO p_201212 VALUES (NEW.*);
    -- Anything outside 2012 goes to the default child
    ELSIF ( NEW.logtime >= DATE '2013-01-01' OR NEW.logtime < DATE '2012-01-01' ) THEN
        INSERT INTO p_default VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range.  Fix the p_insert_trigger() function!';
    END IF;
    -- Returning NULL suppresses the insert into the parent itself,
    -- so the row is stored only in the child
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
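
-- A more compact alternative, sketched under stated assumptions rather than
-- taken from the original article. The function name p_insert_trigger_dynamic
-- is made up for this example. It derives the child name from the month of
-- logtime and falls back to p_default when no table of that name is visible.
-- It assumes PostgreSQL 9.6 or later, where to_regclass() accepts text, and
-- that the children are resolved through the current search_path.
CREATE OR REPLACE FUNCTION p_insert_trigger_dynamic()
RETURNS TRIGGER AS $$
DECLARE
    target text := 'p_' || to_char(NEW.logtime, 'YYYYMM');
BEGIN
    -- to_regclass() returns NULL instead of raising an error when the table does not exist
    IF to_regclass(target) IS NULL THEN
        target := 'p_default';
    END IF;
    -- Expand the NEW row into the column list of the chosen child
    EXECUTE format('INSERT INTO %I SELECT ($1).*', target) USING NEW;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;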

-- Note: DELETE FROM p already removes matching rows from the children, and a
-- row-level trigger on p fires only for rows stored in p itself.
CREATE OR REPLACE FUNCTION p_delete_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF    ( OLD.logtime >= DATE '2012-01-01' AND OLD.logtime < DATE '2012-02-01' ) THEN
        DELETE FROM p_201201 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-02-01' AND OLD.logtime < DATE '2012-03-01' ) THEN
        DELETE FROM p_201202 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-03-01' AND OLD.logtime < DATE '2012-04-01' ) THEN
        DELETE FROM p_201203 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-04-01' AND OLD.logtime < DATE '2012-05-01' ) THEN
        DELETE FROM p_201204 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-05-01' AND OLD.logtime < DATE '2012-06-01' ) THEN
        DELETE FROM p_201205 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-06-01' AND OLD.logtime < DATE '2012-07-01' ) THEN
        DELETE FROM p_201206 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-07-01' AND OLD.logtime < DATE '2012-08-01' ) THEN
        DELETE FROM p_201207 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-08-01' AND OLD.logtime < DATE '2012-09-01' ) THEN
        DELETE FROM p_201208 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-09-01' AND OLD.logtime < DATE '2012-10-01' ) THEN
        DELETE FROM p_201209 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-10-01' AND OLD.logtime < DATE '2012-11-01' ) THEN
        DELETE FROM p_201210 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-11-01' AND OLD.logtime < DATE '2012-12-01' ) THEN
        DELETE FROM p_201211 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2012-12-01' AND OLD.logtime < DATE '2013-01-01' ) THEN
        DELETE FROM p_201212 WHERE logtime=OLD.logtime;
    ELSIF ( OLD.logtime >= DATE '2013-01-01' OR OLD.logtime < DATE '2012-01-01' ) THEN
        DELETE FROM p_default WHERE logtime=OLD.logtime;
    ELSE
        RAISE EXCEPTION 'Date out of range.  Fix the p_delete_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_p_trigger
    BEFORE INSERT ON p
    FOR EACH ROW EXECUTE PROCEDURE p_insert_trigger();

CREATE TRIGGER delete_p_trigger
    BEFORE DELETE ON p
    FOR EACH ROW EXECUTE PROCEDURE p_delete_trigger();

INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (1, timestamp '2012-01-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (2, timestamp '2012-02-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (3, timestamp '2012-03-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (4, timestamp '2012-04-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (5, timestamp '2012-05-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (6, timestamp '2012-06-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (7, timestamp '2012-07-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (8, timestamp '2012-08-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (9, timestamp '2012-09-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (10, timestamp '2012-10-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (11, timestamp '2012-11-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (12, timestamp '2012-12-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (13, timestamp '2013-01-02 12:59:59', 20, 10);
INSERT INTO p (city_id, logtime, peaktemp, unitsales) VALUES (14, timestamp '2011-12-02 12:59:59', 20, 10);

INSERT INTO p (city_id, logtime, peaktemp, unitsales) select m, timestamp '2012-02-02 12:59:59', 20, 10 from generate_series(1,10000) m;

explain select * from p_201202 where city_id=2 and logtime=timestamp '2012-02-02 12:59:59';
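
-- The setup above defines no CHECK constraints on the children of p, so
-- constraint exclusion has nothing to work with when p itself is queried. A
-- sketch of adding one (the constraint name ck_p_201202_logtime is made up for
-- this example) and checking the plan. With constraint_exclusion set to
-- partition or on, p_201202 should no longer appear in the plan for a March
-- timestamp, while children that still lack a constraint are scanned as before.
alter table p_201202 add constraint ck_p_201202_logtime
    check (logtime >= timestamp '2012-02-01' and logtime < timestamp '2012-03-01');
explain select * from p where logtime = timestamp '2012-03-02 12:59:59';
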
Reposted from: https://yq.aliyun.com/articles/2637?spm=5176.100240.searchblog.12.59Jibq#
Time: 2024-10-10 13:17:02
