PostgreSQL 11 Partition Indexes

tech 2023-02-11

Summary of using partition indexes in PG 11

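In PostgreSQL 10, indexes had to be created manually on each partition; they could not be created on the partitioned parent table. PostgreSQL 11 allows an index to be created on the partitioned table itself. Such an index does not create a physical index of its own; instead it acts as a template for the indexes on each partition.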

Automatic partition indexes

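If an index is created on a partitioned table, PostgreSQL automatically creates an index with the same properties on every partition; the parent table itself holds no index data of its own. Automatically created indexes are named following the pattern "{partition name}_{column name}_idx". For a composite index on several columns, the column names are joined with underscores (_). If the index name already exists, a number is appended to the end of the name. If the name is too long, it is shortened.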

CREATE TABLE measurement (
    city_id   int  not null,
    logdate   date not null,
    peaktemp  int,
    unitsales int
) PARTITION BY RANGE (logdate);

-- Create partition measurement_y2018
CREATE TABLE measurement_y2018 PARTITION OF measurement
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');

-- Create partition measurement_y2019
CREATE TABLE measurement_y2019 PARTITION OF measurement
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');

-- Indexes on the parent table, on non-partition-key columns
CREATE INDEX idx_measurement_peaktemp ON measurement(peaktemp);
CREATE INDEX idx_measurement_peaktemp_city_id ON measurement(city_id);

-- Index on the parent table that includes the partition key column
CREATE INDEX idx_measurement_peaktemp_1 ON measurement(logdate,peaktemp);

-- Index on a single partition, on the partition key column
CREATE INDEX idx_measurement_peaktemp_y2018 ON measurement_y2018(logdate);

-- Index on a single partition, on a non-partition-key column
CREATE INDEX idx_measurement_peaktemp_y2018_2 ON measurement_y2018(peaktemp);

-- Composite index on a single partition (partition key plus another column)
CREATE INDEX idx_measurement_peaktemp_y2018_1 ON measurement_y2018(logdate,peaktemp);

Querying the indexes in the database gives the following:

postgres=# select * from pg_indexes where tablename like '%measurement%' order by tablename;
 schemaname | tablename         | indexname                              | tablespace | indexdef
------------+-------------------+----------------------------------------+------------+--------------------------------------------------------------------------------------------------------------
 public     | measurement_y2018 | idx_measurement_peaktemp_y2018         |            | CREATE INDEX idx_measurement_peaktemp_y2018 ON public.measurement_y2018 USING btree (logdate)
 public     | measurement_y2018 | idx_measurement_peaktemp_y2018_2       |            | CREATE INDEX idx_measurement_peaktemp_y2018_2 ON public.measurement_y2018 USING btree (peaktemp)
 public     | measurement_y2018 | idx_measurement_peaktemp_y2018_1       |            | CREATE INDEX idx_measurement_peaktemp_y2018_1 ON public.measurement_y2018 USING btree (logdate, peaktemp)
 public     | measurement_y2018 | measurement_y2018_logdate_peaktemp_idx |            | CREATE INDEX measurement_y2018_logdate_peaktemp_idx ON public.measurement_y2018 USING btree (logdate, peaktemp)
 public     | measurement_y2018 | measurement_y2018_peaktemp_idx         |            | CREATE INDEX measurement_y2018_peaktemp_idx ON public.measurement_y2018 USING btree (peaktemp)
 public     | measurement_y2019 | measurement_y2019_logdate_peaktemp_idx |            | CREATE INDEX measurement_y2019_logdate_peaktemp_idx ON public.measurement_y2019 USING btree (logdate, peaktemp)
 public     | measurement_y2019 | measurement_y2019_peaktemp_idx         |            | CREATE INDEX measurement_y2019_peaktemp_idx ON public.measurement_y2019 USING btree (peaktemp)
(7 rows)

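The indexes whose names end in _idx are the ones created automatically on the partitions from the indexes defined on the parent table.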
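To see which partition index belongs to which parent index without relying on naming, the catalog can be queried directly. The following is a minimal sketch, assuming PostgreSQL 11 or later, where the link between a partitioned index and its per-partition indexes is recorded in pg_inherits and a partitioned index has relkind 'I' in pg_class.

```sql
-- List each partitioned (parent) index together with its per-partition indexes.
SELECT i.inhparent::regclass AS parent_index,
       i.inhrelid::regclass  AS partition_index
FROM pg_inherits AS i
JOIN pg_class AS c ON c.oid = i.inhparent
WHERE c.relkind = 'I'              -- 'I' marks a partitioned index
ORDER BY i.inhparent::regclass::text,
         i.inhrelid::regclass::text;
```

Indexes created directly on a single partition, such as idx_measurement_peaktemp_y2018, have no parent index and so should not appear in this result.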

Partitions that are added later, or attached with ATTACH PARTITION, automatically get the corresponding indexes.
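Both cases can be tried out with a short sketch. It assumes the measurement table and its parent-level indexes from the example above already exist; measurement_y2021 is a hypothetical table name introduced only for illustration.

```sql
-- Case 1: a partition created after the parent-level indexes exist.
CREATE TABLE measurement_y2020 PARTITION OF measurement
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');

-- Case 2: a standalone table (hypothetical name) attached as a partition.
CREATE TABLE measurement_y2021 (LIKE measurement INCLUDING DEFAULTS);
ALTER TABLE measurement ATTACH PARTITION measurement_y2021
    FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');

-- Inspect the indexes that now exist on the two new partitions.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('measurement_y2020', 'measurement_y2021')
ORDER BY tablename, indexname;
```

When a table is attached, PostgreSQL reuses an equivalent index if the table already has one and creates a new index otherwise.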

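Automatically created indexes cannot be dropped individually; they can only be dropped together, by dropping the index on the partitioned table.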

postgres=# DROP INDEX measurement_y2020_peaktemp_idx;
ERROR:  cannot drop index measurement_y2020_peaktemp_idx because index idx_measurement_peaktemp requires it
HINT:  You can drop index idx_measurement_peaktemp instead.
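Following the hint, the index is dropped at the parent level. This is a minimal sketch that assumes the idx_measurement_peaktemp index from the earlier example is still in place.

```sql
-- Dropping the parent-level index removes the attached index on every partition.
DROP INDEX idx_measurement_peaktemp;

-- Check for any remaining auto-created peaktemp indexes on the partitions.
SELECT tablename, indexname
FROM pg_indexes
WHERE indexname ~ '^measurement_y[0-9]+_peaktemp_idx$'
ORDER BY tablename;
```

Indexes that were created directly on a single partition, such as idx_measurement_peaktemp_y2018_2, are independent of the parent index and are not affected by this statement.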

Unique constraints on partitioned tables

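In PostgreSQL 10, unique constraints (PRIMARY KEY and UNIQUE) could only be created on individual partitions, not on the partitioned parent table. PostgreSQL 11 supports unique constraints on partitioned tables.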

When a partition is added or attached (ATTACH), the corresponding primary key is created on it automatically.
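A minimal sketch of this behavior follows. The table names rtable_demo and rtable_demo_100 are hypothetical and chosen only for illustration.

```sql
-- Partitioned table with a primary key on the partition key column.
CREATE TABLE rtable_demo (
    c1 INT,
    c2 VARCHAR(10),
    PRIMARY KEY (c1)
) PARTITION BY RANGE (c1);

-- A partition added afterwards; no primary key is declared on it explicitly.
CREATE TABLE rtable_demo_100 PARTITION OF rtable_demo
    FOR VALUES FROM (1) TO (101);

-- Show the primary key constraints on the parent and on the partition.
SELECT conrelid::regclass AS table_name, conname
FROM pg_constraint
WHERE contype = 'p'
  AND conrelid IN ('rtable_demo'::regclass, 'rtable_demo_100'::regclass);
```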

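Once a unique constraint has been created on a partitioned table, partitions backed by foreign tables (FOREIGN TABLE) can no longer be created, because a unique constraint cannot be created on a foreign table.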

CREATE FOREIGN TABLE rtable200 PARTITION OF rtable
    FOR VALUES FROM (101) TO (200) SERVER remote1;
ERROR:  cannot create index on foreign table "rtable200"

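A primary key or unique constraint must include the partition key columns. Only then can uniqueness be guaranteed across the whole partitioned table, because the unique index on each partition enforces uniqueness only within that partition.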

CREATE TABLE rtable1(c1 INT, c2 VARCHAR(10)) PARTITION BY RANGE(c1);
ALTER TABLE rtable1 ADD CONSTRAINT pk_table1 PRIMARY KEY(c2);
ERROR:  insufficient columns in PRIMARY KEY constraint definition
DETAIL:  PRIMARY KEY constraint on table "rtable1" lacks column "c1" which is part of the partition key.
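One way to satisfy the rule is to add the partition key column to the constraint. The sketch below assumes rtable1 exists as created above and that uniqueness of the pair (c1, c2) is acceptable for the application; it does not enforce uniqueness of c2 alone.

```sql
-- Include the partition key c1 so the constraint can be enforced per partition.
ALTER TABLE rtable1 ADD CONSTRAINT pk_table1 PRIMARY KEY (c1, c2);
```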

Query analysis: partitioned tables with indexes

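When the query condition does not include the partition key, no partition can be pruned, so every partition is scanned in turn (here through each partition's city_id index).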

postgres=# explain select * from measurement where city_id =1;
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 Append  (cost=4.22..44.43 rows=27 width=16)
   ->  Bitmap Heap Scan on measurement_y2018  (cost=4.22..14.76 rows=9 width=16)
         Recheck Cond: (city_id = 1)
         ->  Bitmap Index Scan on measurement_y2018_city_id_idx  (cost=0.00..4.22 rows=9 width=0)
               Index Cond: (city_id = 1)
   ->  Bitmap Heap Scan on measurement_y2019  (cost=4.22..14.76 rows=9 width=16)
         Recheck Cond: (city_id = 1)
         ->  Bitmap Index Scan on measurement_y2019_city_id_idx  (cost=0.00..4.22 rows=9 width=0)
               Index Cond: (city_id = 1)
   ->  Bitmap Heap Scan on measurement_y2020  (cost=4.22..14.76 rows=9 width=16)
         Recheck Cond: (city_id = 1)
         ->  Bitmap Index Scan on measurement_y2020_city_id_idx  (cost=0.00..4.22 rows=9 width=0)
               Index Cond: (city_id = 1)
(13 rows)

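When the query condition includes the partition key (logdate), the planner first narrows the scan to the matching partition and then uses the indexes within that partition.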

postgres=# explain select * from measurement where city_id =1 and logdate ='2019-03-01';
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Append  (cost=8.69..12.71 rows=1 width=16)
   ->  Bitmap Heap Scan on measurement_y2019  (cost=8.69..12.71 rows=1 width=16)
         Recheck Cond: ((city_id = 1) AND (logdate = '2019-03-01'::date))
         ->  BitmapAnd  (cost=8.69..8.69 rows=1 width=0)
               ->  Bitmap Index Scan on measurement_y2019_city_id_idx  (cost=0.00..4.22 rows=9 width=0)
                     Index Cond: (city_id = 1)
               ->  Bitmap Index Scan on measurement_y2019_logdate_peaktemp_idx  (cost=0.00..4.22 rows=9 width=0)
                     Index Cond: (logdate = '2019-03-01'::date)
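To observe the effect of partition pruning on this query, the plans can be compared with the planner setting toggled. This is only a sketch: it assumes the enable_partition_pruning parameter introduced in PostgreSQL 11, and the exact plans and costs will depend on the data and on the server version.

```sql
-- Plan with partition pruning disabled for this session.
SET enable_partition_pruning = off;
EXPLAIN SELECT * FROM measurement
WHERE city_id = 1 AND logdate = '2019-03-01';

-- Restore the default and compare the plan.
RESET enable_partition_pruning;
EXPLAIN SELECT * FROM measurement
WHERE city_id = 1 AND logdate = '2019-03-01';
```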
