千家信息网

PostgreSQL DBA(60) - 列式存储zedstore

发表于:2024-11-30 作者:千家信息网编辑
千家信息网最后更新 2024年11月30日,本节简单介绍了Greenplum开源的列式存储zedstore的安装和使用.安装从Github上下载源码,与普通PG一样,编译安装即可[root@localhost postgres-zedstore
千家信息网最后更新 2024年11月30日PostgreSQL DBA(60) - 列式存储zedstore

本节简单介绍了Greenplum开源的列式存储zedstore的安装和使用.

安装
从Github上下载源码,与普通PG一样,编译安装即可

[root@localhost postgres-zedstore]# ./configure --enable-debug --with-python --with-perl --with-tcl --with-gssapi --with-pam --with-openssl --with-pam --without-ldap --with-libxml --with-libxslt --enable-dtrace --enable-depend --enable-cassert --with-systemd CFLAGS="-O0 -DOPTIMIZER_DEBUG -g3 -gdwarf-2" --prefix=/appdb/zedstorechecking build system type... x86_64-pc-linux-gnu...[root@localhost postgres-zedstore]# make -j4...[root@localhost postgres-zedstore]# make install...PostgreSQL installation complete.

Heap vs ZedStore
创建用户,初始化数据库

[zedstore@localhost ~]$ initdb -E utf8 -D /data/zedstore/testdbThe files belonging to this database system will be owned by user "zedstore".This user must also own the server process.The database cluster will be initialized with locale "en_US.UTF-8".The default text search configuration will be set to "english".Data page checksums are disabled.creating directory /data/zedstore/testdb ... okcreating subdirectories ... okselecting dynamic shared memory implementation ... posixselecting default max_connections ... 100selecting default shared_buffers ... 128MBselecting default timezone ... PRCcreating configuration files ... okrunning bootstrap script ... okperforming post-bootstrap initialization ... oksyncing data to disk ... okinitdb: warning: enabling "trust" authentication for local connectionsYou can change this by editing pg_hba.conf or using the option -A, or--auth-local and --auth-host, the next time you run initdb.Success. You can now start the database server using:    pg_ctl -D /data/zedstore/testdb -l logfile start

下面来对比一下head am和zedstore的性能差异
PG

testdb=# create table t_olap(id int,c1 int,c2 varchar(20));CREATE TABLE                           testdb=# insert into t_olap select x,x,'c2'||x from generate_series(1,5000000) as x;INSERT 0 5000000testdb=#

执行查询

testdb=# explain analyze select avg(id),sum(c1),max(c2) from t_olap;                                                                  QUERY PLAN                                                 ------------------------------------------------------------------------------------------------- Finalize Aggregate  (cost=69209.40..69209.40 rows=1 width=72) (actual time=925.540..925.540 rows=1 loops=1)   ->  Gather  (cost=69209.17..69209.38 rows=2 width=72) (actual time=925.284..932.688 rows=3 loops=1)         Workers Planned: 2         Workers Launched: 2         ->  Partial Aggregate  (cost=68209.17..68209.18 rows=1 width=72) (actual time=911.539..911.539 rows=1 loops=3)               ->  Parallel Seq Scan on t_olap  (cost=0.00..52584.24 rows=2083324 width=17) (actual time=0.037..240.287 rows=1666667 loops=3) Planning Time: 22.703 ms Execution Time: 933.020 ms(8 rows)

执行时间为933ms

ZedStore

[zedstore@localhost testdb]$ psql -d testdbpsql (13devel)Type "help" for help.testdb=# \dA+                                List of access methods   Name   | Type  |         Handler          |              Description               ----------+-------+--------------------------+---------------------------------------- brin     | Index | brinhandler              | block range index (BRIN) access method btree    | Index | bthandler                | b-tree index access method gin      | Index | ginhandler               | GIN index access method gist     | Index | gisthandler              | GiST index access method hash     | Index | hashhandler              | hash index access method heap     | Table | heap_tableam_handler     | heap table access method spgist   | Index | spghandler               | SP-GiST index access method zedstore | Table | zedstore_tableam_handler | zedstore table access method(8 rows)testdb=# create table t_olap(id int,c1 int,c2 varchar(20)) using zedstore;CREATE TABLEtestdb=# \d+ t_olap                                          Table "public.t_olap" Column |         Type          | Collation | Nullable | Default | Storage  | Stats target | Description --------+-----------------------+-----------+----------+---------+----------+--------------+------------- id     | integer               |           |          |         | plain    |              |  c1     | integer               |           |          |         | plain    |              |  c2     | character varying(20) |           |          |         | extended |              | Access method: zedstoretestdb=# insert into t_olap select x,x,'c2'||x from generate_series(1,5000000) as x;INSERT 0 5000000

执行查询

testdb=# explain analyze select avg(id),sum(c1),max(c2) from t_olap;                                                                  QUERY PLAN                                                 ------------------------------------------------------------------------------------------------- Finalize Aggregate  (cost=31425.10..31425.11 rows=1 width=72) (actual time=1707.755..1707.756 rows=1 loops=1)   ->  Gather  (cost=31424.87..31425.08 rows=2 width=72) (actual time=1659.121..1710.512 rows=3 loops=1)         Workers Planned: 2         Workers Launched: 2         ->  Partial Aggregate  (cost=30424.87..30424.88 rows=1 width=72) (actual time=1647.216..1647.217 rows=1 loops=3)               ->  Parallel Seq Scan on t_olap  (cost=0.00..24130.07 rows=839307 width=17) (actual time=0.418..1124.465 rows=1666667 loops=3) Planning Time: 1.907 ms Execution Time: 1753.191 ms(8 rows)

执行时间为1753ms,列式存储似乎没有发挥作用?待续.

参考资料
PostgreSQL 基于access method api的列存zedstore

0