Postgres-XL 集群详细创建步骤
最近公司业务需求,需要使用Postgres-XL 集群,关于这部分知识,网络资料不多。经过一段时间的查询,和各种弯路之后,终于完成安装。将详细步骤完整记录,以备查阅。也希望能帮到需要的人。
下面就开始吧:
主机列表和集群安装的角色分配
10.21.13.109 GTM10.21.13.67 coordinator&datanode10.21.13.60 datanode
2.创建postgres用户,这部分我使用ansible完成的用户创建,以及相关软件包的应用,节省劳动力(yum其实可以使用ansible自带的方式,所以我ansible写的比较业余)
ansible all -m shell -a "useradd postgres"ansible all -m shell -a 'echo "postgres:postgres" | chpasswd'ansible all -m shell -a "yum install -y flex bison readline-devel zlib-devel openjade docbook-style-dsssl gcc"
3. 在每个节点上下载软件,并解压缩
ansible all -m shell -a "wget https://www.postgres-xl.org/downloads/postgres-xl-9.5r1.6.tar.bz2"ansible all -m shell -a "mv /root/postgres-xl-9.5r1.6.tar.bz2 /tmp; chown postgres:postgres /tmp/postgres-xl-9.5r1.6.tar.bz2"ansible all -m shell -a "tar -jxvf /tmp/postgres-xl-9.5r1.6.tar.bz2 -C /home/postgres"
4. 全部节点都创建pgxc路径,用来存放postgres文件
ansible all -m shell -a "su - postgres -c 'mkdir -p /home/postgres/pgxc/nodes/'"
5. 修改每个节点的.bashrc文件,注意:修改/etc/profile是不生效的。
su - postgresvi .bashrc
添加如下内容:
export PGHOME=/home/postgres/pgxlexport PGUSER=postgresexport LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATHexport PATH=$PGHOME/bin:$PATH
使之生效:
source ~/.bashrc
6. 关闭防火墙或者放开相应的端口,这里我直接关闭了每个虚拟机的防火墙,并且重启它们:
service iptables stopchkconfig iptables off --重启后生效
7. 关闭SELinux:
查看SELinux状态:
/usr/sbin/sestatus -v ##如果SELinux status参数为enabled即为开启状态SELinux status: enabled
临时关闭(不用重启机器):
setenforce 0 ##设置SELinux 成为permissive模式 ##setenforce 1 设置SELinux 成为enforcing模式
永久关闭:
修改/etc/selinux/config 文件将SELINUX=enforcing改为SELINUX=disabled
8. 到每一个节点执行下面的软件安装步骤,这个过程注意,如果中间有报错,需要解决到。例如确实gcc包之类的,不然后面的编译也不会成功的。
cd postgres-xl./configure --prefix=/home/postgres/pgxl/makemake installcd contrib/makemake instal
9. 执行pgxc_ctl命令来生成配置集群的模板文件:
./pgxc_ctl ---会提示Error说没有配置文件,忽略即可
PGXC prepare ---执行该命令将会生成一份配置文件模板
10. 根据模板修改对应的主机名,端口等信息,内容如下:
vi ~/pgxc_ctl/pgxc_ctl.confpgxcInstallDir=$HOME/pgxcpgxcOwner=$USERpgxcUser=$pgxcOwnertmpDir=/tmplocalTmpDir=$tmpDirconfigBackup=nconfigBackupHost=pgxc-linkerconfigBackupDir=$HOME/pgxcconfigBackupFile=pgxc_ctl.bakgtmName=gtmgtmMasterServer=10.21.13.109gtmMasterPort=20001gtmMasterDir=$HOME/pgxc/nodes/gtmgtmExtraConfig=nonegtmMasterSpecificExtraConfig=nonegtmSlave=ngtmProxyDir=$HOME/pgxc/nodes/gtm_pxygtmProxy=ygtmProxyNames=(gtm_pxy1 gtm_pxy2)gtmProxyServers=(10.21.13.67 10.21.13.60)gtmProxyPorts=(20001 20001)gtmProxyDirs=($gtmProxyDir $gtmProxyDir)gtmPxyExtraConfig=nonegtmPxySpecificExtraConfig=(none none)coordMasterDir=$HOME/pgxc/nodes/coordcoordSlaveDir=$HOME/pgxc/nodes/coord_slavecoordArchLogDir=$HOME/pgxc/nodes/coord_archlogcoordNames=(coord1)coordPorts=(20004)poolerPorts=(20010)coordPgHbaEntries=(10.21.13.0/25) <<<<<<<<<<<<<<<<<<<<<<<<<非常重要coordMasterServers=(10.21.13.67)coordMasterDirs=($coordMasterDir)coordMaxWALsernder=5coordMaxWALSenders=($coordMaxWALsernder)coordSlave=n <<<<<<<<<<<<<<<<没使用slavecoordSlaveSync=ycoordSlaveServers=(10.21.13.60 10.21.13.67)coordSlavePorts=(20004 20005 20004 20005)coordSlavePoolerPorts=(20010 20011 20010 20011)coordSlaveDirs=($coordSlaveDir $coordSlaveDir)coordArchLogDirs=($coordArchLogDir $coordArchLogDir)coordExtraConfig=coordExtraConfigcat > $coordExtraConfig <11. 初始话命令,如果前面没有任何错误,这里就是最后一步了,PGXL会把全部的部署工作完成的。
pgxc_ctl -c pgxc_ctl.conf init all
完整输出如下:
[postgres@ip-10-21-13-109 pgxc_ctl]$ pgxc_ctl -c pgxc_ctl.conf init all /bin/bashInstalling pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.Reading configuration using /home/postgres/pgxc_ctl/pgxc_ctl_bash --home /home/postgres/pgxc_ctl --configuration /home/postgres/pgxc_ctl/pgxc_ctl.confFinished reading configuration. ******** PGXC_CTL START ***************Current directory: /home/postgres/pgxc_ctlInitialize GTM masterERROR: target directory (/home/postgres/pgxc/nodes/gtm) exists and not empty. Skip GTM initilialization1:3768231680:2017-11-09 19:35:02.983 CST -FATAL: lock file "gtm.pid" already exists2:3768231680:2017-11-09 19:35:02.983 CST -HINT: Is another GTM (PID 14635) running in data directory "/home/postgres/pgxc/nodes/gtm"?LOCATION: CreateLockFile, main.c:2099waiting for server to shut down... doneserver stoppedDone.Start GTM masterserver startingInitialize all the gtm proxies.Initializing gtm proxy gtm_pxy1.Initializing gtm proxy gtm_pxy2.waiting for server to shut down... doneserver stoppedThe files belonging to this GTM system will be owned by user "postgres".This user must also own the server process.fixing permissions on existing directory /home/postgres/pgxc/nodes/gtm_pxy ... okcreating configuration files ... okSuccess.waiting for server to shut down... doneserver stoppedThe files belonging to this GTM system will be owned by user "postgres".This user must also own the server process.fixing permissions on existing directory /home/postgres/pgxc/nodes/gtm_pxy ... okcreating configuration files ... okSuccess.Done.Starting all the gtm proxies.Starting gtm proxy gtm_pxy1.Starting gtm proxy gtm_pxy2.server startingserver startingDone.Initialize all the coordinator masters.Initialize coordinator master coord1.ERROR: target coordinator master coord1 is running now. Skip initilialization.Done.Starting coordinator master.Starting coordinator master coord1ERROR: target coordinator master coord1 is already running now. Skip initialization.Done.Initialize all the datanode masters.Initialize the datanode master datanode1.Initialize the datanode master datanode2.The files belonging to this database system will be owned by user "postgres".This user must also own the server process.The database cluster will be initialized with locale "en_US.UTF-8".The default database encoding has accordingly been set to "UTF8".The default text search configuration will be set to "english".Data page checksums are disabled.fixing permissions on existing directory /home/postgres/pgxc/nodes/dn_master ... okcreating subdirectories ... okselecting default max_connections ... 100selecting default shared_buffers ... 128MBselecting dynamic shared memory implementation ... posixcreating configuration files ... okcreating template1 database in /home/postgres/pgxc/nodes/dn_master/base/1 ... okinitializing pg_authid ... okinitializing dependencies ... okcreating system views ... okcreating cluster information ... okloading system objects' descriptions ... okcreating collations ... okcreating conversions ... okcreating dictionaries ... oksetting privileges on built-in objects ... okcreating information schema ... okloading PL/pgSQL server-side language ... okvacuuming database template1 ... okcopying template1 to template0 ... okcopying template1 to postgres ... oksyncing data to disk ... okfreezing database template0 ... okfreezing database template1 ... okfreezing database postgres ... okWARNING: enabling "trust" authentication for local connectionsYou can change this by editing pg_hba.conf or using the option -A, or--auth-local and --auth-host, the next time you run initdb.Success.The files belonging to this database system will be owned by user "postgres".This user must also own the server process.The database cluster will be initialized with locale "en_US.UTF-8".The default database encoding has accordingly been set to "UTF8".The default text search configuration will be set to "english".Data page checksums are disabled.fixing permissions on existing directory /home/postgres/pgxc/nodes/dn_master ... okcreating subdirectories ... okselecting default max_connections ... 100selecting default shared_buffers ... 128MBselecting dynamic shared memory implementation ... posixcreating configuration files ... okcreating template1 database in /home/postgres/pgxc/nodes/dn_master/base/1 ... okinitializing pg_authid ... okinitializing dependencies ... okcreating system views ... okcreating cluster information ... okloading system objects' descriptions ... okcreating collations ... okcreating conversions ... okcreating dictionaries ... oksetting privileges on built-in objects ... okcreating information schema ... okloading PL/pgSQL server-side language ... okvacuuming database template1 ... okcopying template1 to template0 ... okcopying template1 to postgres ... oksyncing data to disk ... okfreezing database template0 ... okfreezing database template1 ... okfreezing database postgres ... okWARNING: enabling "trust" authentication for local connectionsYou can change this by editing pg_hba.conf or using the option -A, or--auth-local and --auth-host, the next time you run initdb.Success.Done.Starting all the datanode masters.Starting datanode master datanode1.Starting datanode master datanode2.LOG: redirecting log output to logging collector processHINT: Future log output will appear in directory "pg_log".LOG: redirecting log output to logging collector processHINT: Future log output will appear in directory "pg_log".Done.ALTER NODE coord1 WITH (HOST='10.21.13.67', PORT=20004);ALTER NODECREATE NODE datanode1 WITH (TYPE='datanode', HOST='10.21.13.67', PORT=20008, PRIMARY, PREFERRED);ERROR: PGXC Node datanode1: object already definedCREATE NODE datanode2 WITH (TYPE='datanode', HOST='10.21.13.60', PORT=20008);ERROR: PGXC Node datanode2: object already definedSELECT pgxc_pool_reload(); pgxc_pool_reload ------------------ t(1 row)Done.EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''10.21.13.67'', PORT=20004)';EXECUTE DIRECTEXECUTE DIRECT ON (datanode1) 'ALTER NODE datanode1 WITH (TYPE=''datanode'', HOST=''10.21.13.67'', PORT=20008, PRIMARY, PREFERRED)';EXECUTE DIRECTEXECUTE DIRECT ON (datanode1) 'CREATE NODE datanode2 WITH (TYPE=''datanode'', HOST=''10.21.13.60'', PORT=20008)';EXECUTE DIRECTEXECUTE DIRECT ON (datanode1) 'SELECT pgxc_pool_reload()'; pgxc_pool_reload ------------------ t(1 row)EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''10.21.13.67'', PORT=20004)';EXECUTE DIRECTEXECUTE DIRECT ON (datanode2) 'CREATE NODE datanode1 WITH (TYPE=''datanode'', HOST=''10.21.13.67'', PORT=20008, PRIMARY, PREFERRED)';EXECUTE DIRECTEXECUTE DIRECT ON (datanode2) 'ALTER NODE datanode2 WITH (TYPE=''datanode'', HOST=''10.21.13.60'', PORT=20008)';EXECUTE DIRECTEXECUTE DIRECT ON (datanode2) 'SELECT pgxc_pool_reload()'; pgxc_pool_reload ------------------ t(1 row)Done.12. 验证可用性
在coordinator节点上,本地连接方式
psql -p 20004远程用户连接方式
psql -h 10.21.13.67 -p 20004 -U postgrespostgres=# select * from pgxc_node;node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id-----------+-----------+-----------+-------------+----------------+------------------+------------datanode1 | D | 20008 | 10.21.13.67 | t | t | 888802358datanode2 | D | 20008 | 10.21.13.60 | f | f | -905831925coord1 | C | 20004 | 10.21.13.67 | f | f | 188569664313. 建表测试
postgres=# create table test(id integer,name varchar(10));
postgres=# insert into test(id,name) values(1,'test');
postgres=# insert into test(id,name) values(2,'test');
查询结果
postgres=# select * from test;id | name----+------1 | test2 | test(2 rows)问题汇总:
1. 如果登陆到数据库中,执行下面语句,发现只有coord的信息,说明并没有创建成功,
postgres=# select * from pgxc_node;node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id-----------+-----------+-----------+-----------+----------------+------------------+------------coord3 | C | 20004 | localhost | f | f | 1638403545(1 row)而问题是出在coordPgHbaEntries=(10.21.13.0/25)和datanodePgHbaEntries=(10.21.13.0/25)的访问权限控制上,一定要和实际的子网掩码匹配,不然各个datanode节点和coord节点间是无法正常通讯的。
由于没有注意各个细节,让这个问题困扰我一周时间。一直在找为什么datanode之间不能相互识别的原因,其实就是很小的一个点。
[postgres@ip-10-21-13-109 pgxc_ctl]$ ifconfigeth0 Link encap:Ethernet HWaddr 02:57:E7:56:4C:70inet addr:10.21.13.109 Bcast:10.21.13.127 Mask:255.255.255.128<<<<<<10.21.13.0/25inet6 addr: fe80::57:e7ff:fe56:4c70/64 Scope:LinkUP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1RX packets:29500 errors:0 dropped:0 overruns:0 frame:0TX packets:28538 errors:0 dropped:0 overruns:0 carrier:0collisions:0 txqueuelen:1000RX bytes:3113614 (2.9 MiB) TX bytes:4653853 (4.4 MiB)Interrupt:1602. 如果遇到建表,建库失败,也是遇到上面的访问控制权限问题。在部署完成之后,这个参数是在/home/postgres/pgxc/nodes/dn_master/pg_hba.conf和/home/postgres/pgxc/nodes/coord/pg_hba.conf中设置的。但是重新init,会覆盖掉这些文件的。
postgres=# create database test;ERROR: Failed to get pooled connectionsHINT: This may happen because one or more nodes are currently unreachable, either because of node or network failure.Its also possible that the target node may have hit the connection limit or the pooler is configured with low connections.Please check if all nodes are running fine and also review max_connections and max_pool_size configuration parameters3. 有很多人的安装方式,是最后通过手动执行下面命令,添加coord和datanode,其实前面的pgxc_ctl.conf配置正确的话,脚本都是自动帮你完成的,没有必要人工干预
create node coord1 with (type=coordinator,host='c6702', port=20004);create node coord2 with (type=coordinator,host='c6703', port=20005);alter node coord3 with (type=coordinator,host='c6704', port=20004);create node datanode1 with (type=datanode, host='c6702',port=20008);create node datanode2 with (type=datanode, host='c6703',port=20008);alter node datanode3 with (type=datanode, host='c6704',port=20008);4. 第8步,make PGXL软件过程中,一定要仔细查看输出,有错误一定要纠正。不然后续会发现,有很多命令软件,都没有安装成功。
5.如果你在远程登陆postgres sql遇到如下问题
[root@c6703 ~]# psql -h 10.21.13.67 -p 20004 -U postgrespsql: FATAL: no pg_hba.conf entry for host "172.16.32.116", user "postgres", database "postgres"修改/home/postgres/pgxc/nodes/dn_master/pg_hba.conf和/home/postgres/pgxc/nodes/coord/pg_hba.conf中的访问控制权允许对应的IP有访问权限,就可以登陆了。注意,要修改全部的coordinator和datanode的pg_hba.conf配置文件才行。然后通过postgres用户执行reload命令,使配置文件生效。或者重启使配置生效。
[root@c6703 ~]# psql -h 10.21.13.67 -p 20004 -U testpostgres=# SELECT pg_reload_conf();[root@c6703 ~]# psql -h 10.21.13.67 -p 20004 -U testPassword for user test: psql (PGXL 9.5r1.6, based on PG 9.5.8 (Postgres-XL 9.5r1.6))Type "help" for help.下面三篇是我主要参考的blog,帮助很大,但是每一个都有"坑"。所以,只有自己动手实验过,才会记住,你掉过和跳过的"坑"
http://blog.csdn.net/yeruby/article/details/48996027
https://www.cnblogs.com/songyuejie/archive/2015/08/23/4752124.html
http://blog.csdn.net/freeland1/article/details/52346669?ref=myread
附件是第10和11步的文本文件,可以下载修改并使用。
附件:http://down.51cto.com/data/2368732