Postgresql11分区索引

tech2023-02-11  101

pg11分区索引使用总结

在 PostgreSQL 10 中,分区上的索引需要基于各个分区手动创建,而不能基于分区的父表创建索引。PostgreSQL 11 可以基于分区表创建索引。分区表上的索引并不会创建一个物理上的索引,而是为每个分区上的索引创建一个模板.

分区自动索引

如果在分区表上创建了一个索引,PostgreSQL自动为每个分区创建具有相同属性的索引,其主表本身并没有自身索引。 自动创建的索引,名称按照 “{partition name}_{column name}idx” 的模式定义。多个字段的复合索引使用下划线()连接字段名称。如果索引名称已经存在,在名称的最后添加一个数字。如果名称过长,使用缩写。

CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int ) PARTITION BY RANGE (logdate); --建子分区measurement_y2018 CREATE TABLE measurement_y2018 PARTITION OF measurement FOR VALUES FROM ('2018-01-01') TO ('2019-01-01'); --建子分区measurement_y2019 CREATE TABLE measurement_y2019 PARTITION OF measurement FOR VALUES FROM ('2019-01-01') TO ('2020-01-01'); --在主表非分区列建索引 CREATE INDEX idx_measurement_peaktemp ON measurement(peaktemp); CREATE INDEX idx_measurement_peaktemp_city_id ON measurement(city_id); --在主表分区表分区列建索引 CREATE INDEX idx_measurement_peaktemp_1 ON measurement(logdate,peaktemp); --在子分区表分区列建索引 CREATE INDEX idx_measurement_peaktemp_y2018 ON measurement_y2018(logdate); --在子分区表非分区列建索引 CREATE INDEX idx_measurement_peaktemp_y2018_2 ON measurement_y2018(peaktemp); --在子分区表非分区列建复合索引 CREATE INDEX idx_measurement_peaktemp_y2018_1 ON measurement_y2018(logdate,peaktemp);

查询数据库索引如下:

postgres=# select * from pg_indexes where tablename like '%measurement%' order by tablename; schemaname | tablename | indexname | tablespace | indexdef ------------+-------------------+----------------------------------------+------------+---------------------------------------------------------------- ------------------------------------------------- public | measurement_y2018 | idx_measurement_peaktemp_y2018 | | CREATE INDEX idx_measurement_peaktemp_y2018 ON public.measureme nt_y2018 USING btree (logdate) public | measurement_y2018 | idx_measurement_peaktemp_y2018_2 | | CREATE INDEX idx_measurement_peaktemp_y2018_2 ON public.measure ment_y2018 USING btree (peaktemp) public | measurement_y2018 | idx_measurement_peaktemp_y2018_1 | | CREATE INDEX idx_measurement_peaktemp_y2018_1 ON public.measure ment_y2018 USING btree (logdate, peaktemp) public | measurement_y2018 | measurement_y2018_logdate_peaktemp_idx | | CREATE INDEX measurement_y2018_logdate_peaktemp_idx ON public.m easurement_y2018 USING btree (logdate, peaktemp) public | measurement_y2018 | measurement_y2018_peaktemp_idx | | CREATE INDEX measurement_y2018_peaktemp_idx ON public.measureme nt_y2018 USING btree (peaktemp) public | measurement_y2019 | measurement_y2019_logdate_peaktemp_idx | | CREATE INDEX measurement_y2019_logdate_peaktemp_idx ON public.m easurement_y2019 USING btree (logdate, peaktemp) public | measurement_y2019 | measurement_y2019_peaktemp_idx | | CREATE INDEX measurement_y2019_peaktemp_idx ON public.measureme nt_y2019 USING btree (peaktemp) (7 rows)

以_idx结尾的为由主表为分区表自动创建的索引

随后新增的分区或者通过 ATTACH PARTITION 挂载的分区都会自动创建相应的索引。

CREATE TABLE measurement_y2020 (LIKE measurement); ALTER TABLE measurement ATTACH PARTITION measurement_y2020 FOR VALUES FROM ('2020-01-01') TO ('2021-01-01'); postgres=# \d measurement_y2020 Table "public.measurement_y2020" Column | Type | Collation | Nullable | Default -----------+---------+-----------+----------+--------- city_id | integer | | not null | logdate | date | | not null | peaktemp | integer | | | unitsales | integer | | | Partition of: measurement FOR VALUES FROM ('2020-01-01') TO ('2021-01-01') Indexes: "measurement_y2020_logdate_peaktemp_idx" btree (logdate, peaktemp) "measurement_y2020_peaktemp_idx" btree (peaktemp)

自动创建的索引不能单独删除,可以通过分区表统一删除。

postgres=# DROP INDEX measurement_y2020_peaktemp_idx; ERROR: cannot drop index measurement_y2020_peaktemp_idx because index idx_measurement_peaktemp requires it HINT: You can drop index idx_measurement_peaktemp instead.

分区表唯一约束

对于 PostgreSQL 10,只能基于分区创建唯一约束(PRIMARY KEY 和 UNIQUE KEY),而不能针对分区的父表创建唯一约束。PostgreSQL 11 支持分区表上的唯一约束。

CREATE TABLE rtable(c1 INT, c2 VARCHAR(10)) PARTITION BY RANGE(c1); ALTER TABLE rtable ADD CONSTRAINT pk_rtable PRIMARY KEY(c1); \d rtable Table "public.rtable" Column | Type | Collation | Nullable | Default --------+-----------------------+-----------+----------+--------- c1 | integer | | not null | c2 | character varying(10) | | | Partition key: RANGE (c1) Indexes: "pk_rtable" PRIMARY KEY, btree (c1) Number of partitions: 0

添加分区或者加载(ATTACH)分区时自动创建相应的主键:

CREATE TABLE rtable100 PARTITION OF rtable FOR VALUES FROM (1) TO (100); \d rtable100 Table "public.rtable100" Column | Type | Collation | Nullable | Default --------+-----------------------+-----------+----------+--------- c1 | integer | | not null | c2 | character varying(10) | | | Partition of: rtable FOR VALUES FROM (1) TO (100) Indexes: "rtable100_pkey" PRIMARY KEY, btree (c1)

如果在分区表上创建了唯一约束,无法再创建基于外部表(FOREIGN TABLE)的分区。因为无法为外部表创建唯一约束。

CREATE FOREIGN TABLE rtable200 PARTITION OF rtable FOR VALUES FROM (101) TO (200) SERVER remote1; ERROR: cannot create index on foreign table "rtable200"

主键约束或唯一约束必须包含分区字段。这样才能确保整个分区表内的唯一性,因为每个分区上的唯一约束只维护自身的唯一性。

CREATE TABLE rtable1(c1 INT, c2 VARCHAR(10)) PARTITION BY RANGE(c1); ALTER TABLE rtable1 ADD CONSTRAINT pk_table1 PRIMARY KEY(c2); ERROR: insufficient columns in PRIMARY KEY constraint definition DETAIL: PRIMARY KEY constraint on table "rtable1" lacks column "c1" which is part of the partition key.

分区表+索引查询分析

查询条件无分区键,全表扫描,各个子分区都会查询一遍

postgres=# explain select * from measurement where city_id =1; QUERY PLAN -------------------------------------------------------------------------------------------------- Append (cost=4.22..44.43 rows=27 width=16) -> Bitmap Heap Scan on measurement_y2018 (cost=4.22..14.76 rows=9 width=16) Recheck Cond: (city_id = 1) -> Bitmap Index Scan on measurement_y2018_city_id_idx (cost=0.00..4.22 rows=9 width=0) Index Cond: (city_id = 1) -> Bitmap Heap Scan on measurement_y2019 (cost=4.22..14.76 rows=9 width=16) Recheck Cond: (city_id = 1) -> Bitmap Index Scan on measurement_y2019_city_id_idx (cost=0.00..4.22 rows=9 width=0) Index Cond: (city_id = 1) -> Bitmap Heap Scan on measurement_y2020 (cost=4.22..14.76 rows=9 width=16) Recheck Cond: (city_id = 1) -> Bitmap Index Scan on measurement_y2020_city_id_idx (cost=0.00..4.22 rows=9 width=0) Index Cond: (city_id = 1) (13 rows)

查询条件包括分区键(logdate),先定位具体的分区子表,再在子表进行索引查询

postgres=# explain select * from measurement where city_id =1 and logdate ='2019-03-01'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Append (cost=8.69..12.71 rows=1 width=16) -> Bitmap Heap Scan on measurement_y2019 (cost=8.69..12.71 rows=1 width=16) Recheck Cond: ((city_id = 1) AND (logdate = '2019-03-01'::date)) -> BitmapAnd (cost=8.69..8.69 rows=1 width=0) -> Bitmap Index Scan on measurement_y2019_city_id_idx (cost=0.00..4.22 rows=9 width=0) Index Cond: (city_id = 1) -> Bitmap Index Scan on measurement_y2019_logdate_peaktemp_idx (cost=0.00..4.22 rows=9 width=0) Index Cond: (logdate = '2019-03-01'::date)
最新回复(0)