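Converting Oracle 11gR2 RAC from a File System to ASM Extended RAC, with High-Availability Tests
I had wanted to try out ASM Extended RAC for a long time, but never had the resources to test it, so I waited and waited...
Patience pays off, haha! I finally got the resources to play with.
Two storage arrays, EMC and HDS, each in a different machine room.
The original test system used a file system, so it first had to be switched to ASM before building the ASM Extended RAC.
I ran into a series of problems during the conversion. Solving them was frustrating at times, but once the Extended RAC was up and running I felt an odd sense of accomplishment.
Dear readers, don't you feel the same after cracking a big problem? Hehe.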
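1. System environment
1.1 OS and DB versions
Host OS version: AIX 7.1 ("7100-02-03-1334")
Oracle version: Oracle 11.2.0.3 PSU10
RAC: yes
Number of nodes: 4
Storage: HDS 100G, EMC 50G
ASM or file system: cluster file system built with the Symantec VERITAS volume manager
1.2 Hardware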
RAM : 128
SWAP: 13G
1.3 AIX /tmp file system
8GB
1.4 AIX JDK & JRE
IBM JDK 1.6.0.00 (64 BIT)
1.5 Directory details
/oracle 50GB
/oraclelog 30GB
/ocrvote 2G
/archivelog 400G
/oradata 850
1.6 Host IP configuration
100.15.64.180 testdb1
100.15.64.181 testdb2
100.15.64.182 testdb3
100.15.64.183 testdb4
100.15.64.184 testdb1-vip
100.15.64.185 testdb2-vip
100.15.64.186 testdb3-vip
100.15.64.187 testdb4-vip
100.15.64.188 testdb-scan
7.154.64.1 testdb1-priv
7.154.64.2 testdb2-priv
7.154.64.3 testdb3-priv
7.154.64.4 testdb4-priv
2. Switching from the file system to ASM
2.1 Changing disk ownership and permissions
chown grid:asmadmin /dev/vx/rdmp/remc0_04a1
chown grid:asmadmin /dev/vx/rdmp/rhitachi_v0_11cd
chmod 660 /dev/vx/rdmp/remc0_04a1
chmod 660 /dev/vx/rdmp/rhitachi_v0_11cd
(Note: the test system uses Symantec's storage multipathing software, so no disk attributes need to be changed.)
2.2 Creating the ASM instance
su - grid
export DISPLAY=100.15.70.169:0.0
asmca
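(Note: create the OCRVOTE disk group with NORMAL redundancy and 2 failure groups, using at least 3 disks; exactly 3 disks is recommended. Even if the failure groups hold more than 3 disks, voting files moved into this disk group occupy only 3 of them, and crsctl query css votedisk shows voting files on just 3 disks. The usable space of a disk group is limited by its smallest failure group.)
2.3 Creating the ASM disk groups SYSDG and DATADG and changing disk group attributes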
su - grid
export DISPLAY=100.15.70.169:0.0
asmca
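Note: put the disks from the same storage array in the same failure group.
From Oracle 11g onward, the disk group's compatible.rdbms attribute should be raised to 11.2.0.0; it defaults to a 10.x release (10.1 according to Oracle's documentation). If it is left unchanged and the disk group uses two failure groups, then after one failure group fails and is repaired, bringing it back online raises the following error: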
ORA-15283: ASM operation requires compatible.rdbms of 11.1.0.0.0 or higher
Command to change it:
alter diskgroup SYSDG set attribute 'compatible.rdbms'='11.2.0.0';
select name,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup;
---- COMPATIBILITY is the ASM compatibility version of the disk group,
DATABASE_COMPATIBILITY is the database (RDBMS) compatibility version
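To make the compatibility check concrete, here is a minimal shell sketch that compares dotted version strings and flags any disk group whose RDBMS compatibility is below the 11.1.0.0.0 floor named in ORA-15283. The `version_ge` helper and the pipe-separated sample rows are illustrative assumptions, not output captured from this system.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A >= B, compared field by field.
# Hypothetical helper written for this illustration.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }' </dev/null
}

# Assumed sample rows in the form name|database_compatibility.
sample='SYSDG|11.2.0.0.0
DATADG|10.1.0.0.0'

printf '%s\n' "$sample" | while IFS='|' read -r dg compat; do
    if version_ge "$compat" "11.1.0.0.0"; then
        echo "$dg: compatible.rdbms $compat is high enough"
    else
        echo "$dg: compatible.rdbms $compat is too low, raise it before onlining disks"
    fi
done
```

On a real system the rows would come from the v$asm_diskgroup query shown earlier rather than from a hard-coded sample.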
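2.4 Migrating data files from the file system to ASM
No database had been created for this test, so there were no data files to migrate. If migration is needed, use RMAN.
2.5 Moving the OCR and voting disks into the OCRVOTE disk group
1) Check the OCR and the voting disks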
root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3296
Available space (kbytes) : 258824
ID : 1187520997
Device/File Name : /ocrvote/ocr1
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE a948649dc0e14f65bf171ba2ca496962 (/ocrvote/votedisk1) []
2. ONLINE a5f290d560684f47bf82eb3d34db5fc7 (/ocrvote/votedisk2) []
3. ONLINE 49617fb984fc4fcdbf5b7566a9e1778f (/ocrvote/votedisk3) []
Located 3 voting disk(s).
2) Check the resource status
$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.LISTENER.lsnr
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.OCRVOTE.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.SYSDG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.asm
ONLINE ONLINE testdb1 Started
ONLINE ONLINE testdb2 Started
ONLINE ONLINE testdb3 Started
ONLINE ONLINE testdb4 Started
ora.gsd
OFFLINE OFFLINE testdb1
OFFLINE OFFLINE testdb2
OFFLINE OFFLINE testdb3
OFFLINE OFFLINE testdb4
ora.net1.network
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.ons
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.registry.acfs
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE testdb1
ora.cvu
1 ONLINE ONLINE testdb1
ora.oc4j
1 ONLINE ONLINE testdb1
ora.scan1.vip
1 ONLINE ONLINE testdb1
ora.testdb1.vip
1 ONLINE ONLINE testdb1
ora.testdb2.vip
1 ONLINE ONLINE testdb2
ora.testdb3.vip
1 ONLINE ONLINE testdb3
ora.testdb4.vip
1 ONLINE ONLINE testdb4
3) Back up the OCR
root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -manualbackup
root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -showbackup
4) Add the OCR to the disk group, then delete the original OCR on the file system
root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -add +OCRVOTE
root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3336
Available space (kbytes) : 258784
ID : 1187520997
Device/File Name : /ocrvote/ocr1
Device/File integrity check succeeded
Device/File Name : +OCRVOTE
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -delete /ocrvote/ocr1
root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3336
Available space (kbytes) : 258784
ID : 1187520997
Device/File Name : +OCRVOTE
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
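A quick way to confirm that the OCR now lives only in ASM is to pull the configured locations out of the ocrcheck output. The sketch below assumes the "Device/File Name : location" layout shown above; `ocr_locations` is a hypothetical helper, and the sample text is the intermediate state in which both locations were still configured.

```shell
#!/bin/sh
# ocr_locations: read ocrcheck output on stdin, print each configured location.
ocr_locations() {
    awk -F: '/Device\/File Name/ {
        loc = $2
        gsub(/^[ \t]+|[ \t]+$/, "", loc)   # trim surrounding whitespace
        print loc
    }'
}

# Sample lines taken from the ocrcheck output after "ocrconfig -add +OCRVOTE".
sample='Device/File Name : /ocrvote/ocr1
Device/File integrity check succeeded
Device/File Name : +OCRVOTE
Device/File integrity check succeeded
Device/File not configured'

printf '%s\n' "$sample" | ocr_locations | while read -r loc; do
    case "$loc" in
        +*) echo "$loc: ASM disk group" ;;
        *)  echo "$loc: file system path, still to be removed" ;;
    esac
done
```

In practice the function would be fed directly, for example by piping the output of ocrcheck into it as root.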
5) Move the voting disks into the disk group
root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE a948649dc0e14f65bf171ba2ca496962 (/ocrvote/votedisk1) []
2. ONLINE a5f290d560684f47bf82eb3d34db5fc7 (/ocrvote/votedisk2) []
3. ONLINE 49617fb984fc4fcdbf5b7566a9e1778f (/ocrvote/votedisk3) []
Located 3 voting disk(s).
root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl replace votedisk +OCRVOTE
CRS-4256: Updating the profile
Successful addition of voting disk 3a5e5e8622024f17bf0c1a4594e303f5.
Successful addition of voting disk 92ff4555f7064f70bf3c022bd687dbc5.
Successful addition of voting disk 19a1fed74b7f4fb6bf780d43b5427dc9.
Successful deletion of voting disk a948649dc0e14f65bf171ba2ca496962.
Successful deletion of voting disk a5f290d560684f47bf82eb3d34db5fc7.
Successful deletion of voting disk 49617fb984fc4fcdbf5b7566a9e1778f.
Successfully replaced voting disk group with +OCRVOTE.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 3a5e5e8622024f17bf0c1a4594e303f5 (/dev/vx/rdmp/emc0_04a1) [OCRVOTE]
2. ONLINE 92ff4555f7064f70bf3c022bd687dbc5 (/dev/vx/rdmp/hitachi_vsp0_11cc) [OCRVOTE]
3. ONLINE 19a1fed74b7f4fb6bf780d43b5427dc9 (/dev/vx/rdmp/emc0_04c1) [OCRVOTE]
Located 3 voting disk(s).
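If you want to check the result in a script rather than by eye, the listing can be summarized with a little awk. This is only a sketch: `votedisk_summary` is a hypothetical helper that assumes the column layout printed by crsctl query css votedisk above (index, state, id, file name in parentheses, disk group in brackets).

```shell
#!/bin/sh
# votedisk_summary: read "crsctl query css votedisk" output on stdin,
# print state, file and disk group per voting file, then the totals.
votedisk_summary() {
    awk '$1 ~ /^[0-9]+\.$/ {
             total++
             if ($2 == "ONLINE") online++
             file = $4
             gsub(/[()]/, "", file)        # strip the surrounding parentheses
             print $2, file, $5
         }
         END { printf "online=%d total=%d\n", online, total }'
}

# Sample rows copied from the listing above.
sample='##  STATE    File Universal Id                File Name Disk group
 1. ONLINE   3a5e5e8622024f17bf0c1a4594e303f5 (/dev/vx/rdmp/emc0_04a1) [OCRVOTE]
 2. ONLINE   92ff4555f7064f70bf3c022bd687dbc5 (/dev/vx/rdmp/hitachi_vsp0_11cc) [OCRVOTE]
 3. ONLINE   19a1fed74b7f4fb6bf780d43b5427dc9 (/dev/vx/rdmp/emc0_04c1) [OCRVOTE]
Located 3 voting disk(s).'

printf '%s\n' "$sample" | votedisk_summary
```

A disk group column of [] in the output means the voting file is still on the file system, as in the listing taken before the replace.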
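3. Adding an NFS file to the OCRVOTE disk group as the third (quorum) voting disk
ASM Extended RAC needs a Linux PC server placed somewhere apart from the two storage arrays, with a file system created on it. That file system is NFS-mounted on the ASM Extended RAC servers, and the disk file on the NFS mount is created with dd.
3.1 NFS server information
OS version: Linux el5 x86_64
3.2 Creating the grid user on the NFS server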
groupadd -g 1000 oinstall
groupadd -g 1100 asmadmin
useradd -u 1100 -g oinstall -G oinstall,asmadmin -d /home/grid -c "GRID Software Owner" grid
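Note: the user and group IDs on the NFS server should match those on the production database hosts.
3.3 Creating the directory on the NFS server and setting its ownership (the disk file itself is created with dd in step 3.8)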
cd /oradata
mkdir votedisk
chown 1100:1100 votedisk
3.4 Editing /etc/exports on the NFS server and restarting NFS
vi /etc/exports
Add the following line:
/oradata/votedisk *(rw,sync,all_squash,anonuid=1100,anongid=1100)
service nfs stop
service nfs start
3.5 Checking that the NFS exports include the new votedisk directory
[root@ywtcdb ~]# exportfs -v
/oradata 100.15.64.*(rw,wdelay,no_root_squash,no_subtree_check,anonuid=65534,anongid=65534)
/oradata/votedisk
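(Note: the /oradata/votedisk entry is the newly added export.)
3.6 Editing /etc/filesystems on the production hosts so the directory is mounted automatically at boot (run on every node)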
su - root
mkdir /voting_disk
chown grid:asmadmin /voting_disk
vi /etc/filesystems
Add the following:
/voting_disk:
dev = "/oradata/votedisk"
vfs = nfs
nodename = ywtcdb
mount = true
options = rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys
account = false
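(Note: follow the format of the existing entries in /etc/filesystems exactly, including punctuation and spacing. It is best to configure NFS with smit nfs and then edit the options attribute of the mount point's stanza in /etc/filesystems afterwards. The options attribute must be rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys.)
Use smit nfs to set up the NFS mount at boot: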
#smit nfs
[TOP] [Entry Fields]
* Pathname of mount point [/voting_disk]
* Pathname of remote directory [/oradata/votedisk]
* Host where remote directory resides [ywtcdb]
Mount type name []
* Security method [sys]
* Mount now, add entry to /etc/filesystems or both? both
* /etc/filesystems entry will mount the directory yes
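Since smit may write a different options line than the one required here, it can help to verify the stanza afterwards. The following is a rough sketch, not a vetted tool: `stanza_options` and `missing_opts` are hypothetical helpers, and the sample stanza deliberately leaves out noac to show a failing case.

```shell
#!/bin/sh
# stanza_options MOUNTPOINT: read /etc/filesystems-style text on stdin and
# print the value of the "options" attribute in that mount point's stanza.
stanza_options() {
    awk -v mp="$1:" '
        $1 == mp  { in_stanza = 1; next }
        /^[^ \t]/ { in_stanza = 0 }        # any unindented line starts a new stanza
        in_stanza && $1 == "options" { sub(/^[^=]*=[ \t]*/, ""); print; exit }'
}

# missing_opts ACTUAL REQUIRED: print each required option absent from ACTUAL.
missing_opts() {
    _have=",$1,"
    for _opt in $(printf '%s' "$2" | tr ',' ' '); do
        case "$_have" in
            *",$_opt,"*) ;;
            *) printf '%s\n' "$_opt" ;;
        esac
    done
}

required='rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys'

# Assumed sample stanza; on a real host read /etc/filesystems instead.
stanza='/voting_disk:
        dev      = "/oradata/votedisk"
        vfs      = nfs
        options  = rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,sec=sys
        account  = false'

actual=$(printf '%s\n' "$stanza" | stanza_options /voting_disk)
missing=$(missing_opts "$actual" "$required")
if [ -z "$missing" ]; then
    echo "options are complete"
else
    echo "missing options: $missing"
fi
```

The comparison ignores option order on purpose, so only options that are actually absent are reported.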
3.7 Mounting the directory manually (run on every node)
/usr/sbin/nfso -p -o nfs_use_reserved_ports=1
or: nfso -p -o nfs_use_reserved_ports=1
su - root
mount -v nfs -o rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys 100.15.57.125:/oradata/votedisk /voting_disk
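Note: in the command, 100.15.57.125 is the IP address of the NFS server, /oradata/votedisk is the directory on the NFS server, and /voting_disk is the directory on the production host.
3.8 Creating a disk file with dd (on any one production node)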
dd if=/dev/zero of=/voting_disk/vote_disk_nfs bs=1M count=1000
3.9 Adding the new disk to the OCRVOTE disk group
su - grid
export DISPLAY=100.15.70.169:0.0
asmca
In asmca, first change the Disk Discovery Path.
Before:
/dev/vx/rdmp/*
After:
/voting_disk/vote_disk_nfs, /dev/vx/rdmp/*
Add /voting_disk/vote_disk_nfs to the OCRVOTE disk group in a new failure group of its own. Afterwards the OCRVOTE disk group has 3 failure groups.
3.10 Checking that a voting file is on the new disk
$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 89210622f0864ff0bf9517205691e679 (/voting_disk/vote_disk_nfs) [OCRVOTE]
2. ONLINE 55c4ee685a824ff3bf6ce510bf09468e (/dev/vx/rdmp/remc0_04a1) [OCRVOTE]
3. ONLINE 159234e88fe64f55bf0d4571362c3b07 (/dev/vx/rdmp/rhitachi_v0_11cd) [OCRVOTE]
Located 3 voting disk(s).
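3.11 Create the database. Once it is created, the ASM Extended RAC setup is complete.
4. ASM Extended RAC high-availability tests
4.1 Pulling the EMC storage fibre cables on nodes 1 and 2 to simulate the loss of one storage array
The CSS logs show:
Node 1: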
2014-05-20 14:46:44.886:
[cssd(4129042)]CRS-1649:An I/O error occured for voting file: /dev/remc0_04a5; details at (:CSSNM00060:) in /oracle/app/11.2.0/grid/log/testdb1/cssd/ocssd.log.
2014-05-20 14:46:44.886:
[cssd(4129042)]CRS-1649:An I/O error occured for voting file: /dev/remc0_04a5; details at (:CSSNM00059:) in /oracle/app/11.2.0/grid/log/testdb1/cssd/ocssd.log.
2014-05-20 14:46:46.051:
[cssd(4129042)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.071:
[cssd(4129042)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .
Node 2:
2014-05-20 14:46:46.053:
[cssd(4195026)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb2/cssd/ocssd.log.
2014-05-20 14:46:46.053:
[cssd(4195026)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.071:
[cssd(4195026)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .
Node 3:
2014-05-20 14:46:46.053:
[cssd(3604942)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb3/cssd/ocssd.log.
2014-05-20 14:46:46.053:
[cssd(3604942)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.074:
[cssd(3604942)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .
Node 4:
2014-05-20 14:46:46.053:
[cssd(3015132)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb4/cssd/ocssd.log.
2014-05-20 14:46:46.053:
[cssd(3015132)]CRS-1626:A Configuration change request completed successfully
2014-05-20 14:46:46.073:
[cssd(3015132)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .
CRS status is normal:
testdb3:/oracle/app/11.2.0/grid/log/testdb3/cssd(testdb3)$/oracle/app/11.2.0/grid/bin/crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.LISTENER.lsnr
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.OCRVOTE.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.SYSDG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.asm
ONLINE ONLINE testdb1 Started
ONLINE ONLINE testdb2 Started
ONLINE ONLINE testdb3 Started
ONLINE ONLINE testdb4 Started
ora.gsd
OFFLINE OFFLINE testdb1
OFFLINE OFFLINE testdb2
OFFLINE OFFLINE testdb3
OFFLINE OFFLINE testdb4
ora.net1.network
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.ons
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.registry.acfs
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE testdb4
ora.cvu
1 ONLINE ONLINE testdb3
ora.oc4j
1 ONLINE ONLINE testdb3
ora.scan1.vip
1 ONLINE ONLINE testdb4
ora.testdb.db
1 ONLINE ONLINE testdb1 Open
2 ONLINE ONLINE testdb2 Open
3 ONLINE ONLINE testdb3 Open
4 ONLINE ONLINE testdb4 Open
ora.testdb1.vip
1 ONLINE ONLINE testdb1
ora.testdb2.vip
1 ONLINE ONLINE testdb2
ora.testdb3.vip
1 ONLINE ONLINE testdb3
ora.testdb4.vip
1 ONLINE ONLINE testdb4
The voting disks now look like this:
$ /oracle/app/11.2.0/grid/bin/crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 8a31ddf5013d4fb1bfdbb01d6fc6eb7b (/dev/rhitachi_v0_11cc) [OCRVOTE]
2. ONLINE 1ef9486d54b24f8cbf07814d2848a009 (/voting_disk/vote_disk_nfs) [OCRVOTE]
Located 2 voting disk(s).
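Why do the nodes survive with only 2 voting files left? As I understand the rule, Oracle Clusterware keeps a node in the cluster as long as it can reach more than half of the configured voting files, and here 3 were configured. The arithmetic is small enough to sketch in shell; `has_majority` is a hypothetical helper name.

```shell
#!/bin/sh
# has_majority ONLINE TOTAL: succeed when ONLINE is a strict majority of TOTAL.
has_majority() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

total=3   # voting files configured in OCRVOTE: EMC, HDS and the NFS file
for online in 3 2 1; do
    if has_majority "$online" "$total"; then
        echo "$online of $total voting files reachable: node stays in the cluster"
    else
        echo "$online of $total voting files reachable: node would be evicted"
    fi
done
```

This is also why the third voting file sits on NFS outside both machine rooms: losing either array still leaves 2 of 3.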
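After the fibre cables are plugged back in, bring the disks online manually; the two storage arrays then resynchronize automatically: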
alter diskgroup SYSDG online disks in failgroup fail_1;
alter diskgroup DATADG online disks in failgroup fail_1;
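Test result: every EMC disk went OFFLINE automatically in the ASM disk groups on all nodes, the HDS storage stayed in service, and the instances on all nodes kept running normally. We also pulled the HDS fibre cables and saw the same behaviour as with EMC. The conclusion: when one storage array goes down, ASM Extended RAC keeps running on the surviving array and every instance stays up. Once the fibre cables are plugged back in and the disks are brought online manually, the two arrays resynchronize automatically.
Note: the disk group that holds the voting files brings its disks back online automatically once they are reachable again.
4.2 Rebooting nodes 1 and 2 to simulate a sudden host failure
After rebooting nodes 1 and 2, the CRS resource status is: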
$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.DATADG.dg
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.LISTENER.lsnr
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.OCRVOTE.dg
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.SYSDG.dg
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.asm
ONLINE ONLINE testdb3 Started
ONLINE ONLINE testdb4 Started
ora.gsd
OFFLINE OFFLINE testdb3
OFFLINE OFFLINE testdb4
ora.net1.network
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.ons
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.registry.acfs
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE testdb3
ora.cvu
1 ONLINE ONLINE testdb3
ora.oc4j
1 ONLINE ONLINE testdb3
ora.scan1.vip
1 ONLINE ONLINE testdb3
ora.testdb.db
1 ONLINE OFFLINE
2 ONLINE OFFLINE
3 ONLINE ONLINE testdb3 Open
4 ONLINE ONLINE testdb4 Open
ora.testdb1.vip
1 ONLINE INTERMEDIATE testdb4 FAILED OVER
ora.testdb2.vip
1 ONLINE INTERMEDIATE testdb3 FAILED OVER
ora.testdb3.vip
1 ONLINE ONLINE testdb3
ora.testdb4.vip
1 ONLINE ONLINE testdb4
Once nodes 1 and 2 are back up, the CRS status is:
$crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.LISTENER.lsnr
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.OCRVOTE.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.SYSDG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.asm
ONLINE ONLINE testdb1 Started
ONLINE ONLINE testdb2 Started
ONLINE ONLINE testdb3 Started
ONLINE ONLINE testdb4 Started
ora.gsd
OFFLINE OFFLINE testdb1
OFFLINE OFFLINE testdb2
OFFLINE OFFLINE testdb3
OFFLINE OFFLINE testdb4
ora.net1.network
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.ons
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.registry.acfs
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE testdb3
ora.cvu
1 ONLINE ONLINE testdb3
ora.oc4j
1 ONLINE ONLINE testdb4
ora.scan1.vip
1 ONLINE ONLINE testdb3
ora.testdb.db
1 ONLINE ONLINE testdb1 Open
2 ONLINE ONLINE testdb2 Open
3 ONLINE ONLINE testdb3 Open
4 ONLINE ONLINE testdb4 Open
ora.testdb1.vip
1 ONLINE ONLINE testdb1
ora.testdb2.vip
1 ONLINE ONLINE testdb2
ora.testdb3.vip
1 ONLINE ONLINE testdb3
ora.testdb4.vip
1 ONLINE ONLINE testdb4
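Test result: when one or more nodes go down, their VIPs fail over to surviving nodes and all clients reconnect to an available node. Once the rebooted hosts are back up, CRS starts automatically and the VIPs fail back normally.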
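Spotting failed-over VIPs in a long crsctl stat res -t listing is tedious, so here is a small sketch that extracts them. It assumes the layout shown in the listings above, where each resource name starts a line with "ora." and the instance rows follow it; `failed_over_vips` is a hypothetical helper.

```shell
#!/bin/sh
# failed_over_vips: read "crsctl stat res -t" output on stdin and print
# "resource -> server" for every row marked FAILED OVER.
failed_over_vips() {
    awk '/^ora\./      { res = $1; next }     # remember the current resource
         /FAILED OVER/ { print res, "->", $4 }'
}

# Sample rows copied from the listing taken while nodes 1 and 2 were down.
sample='ora.testdb1.vip
      1        ONLINE  INTERMEDIATE testdb4                  FAILED OVER
ora.testdb2.vip
      1        ONLINE  INTERMEDIATE testdb3                  FAILED OVER
ora.testdb3.vip
      1        ONLINE  ONLINE       testdb3'

printf '%s\n' "$sample" | failed_over_vips
```

An empty result after the nodes come back is a quick sign that every VIP has returned to its home node.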
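4.3 Simulating a public network outage
The hosts are virtualized, so the network cable cannot be pulled. Instead, the interface carrying node 1's public IP was taken down with ifconfig en1 down.
1) On node 1, the public IP, the VIP and the SCAN IP are all on interface en1.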
root@testdb1:/#netstat -in
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en1 1500 link#2 0.14.5e.79.5c.ca 5153732 0 4066346 2 0
en1 1500 100.15.64 100.15.64.180 5153732 0 4066346 2 0
en1 1500 100.15.64 100.15.64.184 5153732 0 4066346 2 0
en1 1500 100.15.64 100.15.64.188 5153732 0 4066346 2 0
en2 1500 link#3 0.14.5e.79.5b.e6 40305463 0 44224443 2 0
en2 1500 7.154.64 7.154.64.1 40305463 0 44224443 2 0
en2 1500 169.254 169.254.78.30 40305463 0 44224443 2 0
lo0 16896 link#1 2316784 0 2316787 0 0
lo0 16896 127 127.0.0.1 2316784 0 2316787 0 0
lo0 16896 ::1%1 2316784 0 2316787 0 0
2) Run ifconfig en1 down to start the test
root@testdb1:/oracle/app/11.2.0/grid/bin#ifconfig en1 down
3) The CRS resource status shows that the VIP and the SCAN IP have both moved to healthy nodes
testdb3:/home/oracle(testdb3)$crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.LISTENER.lsnr
ONLINE OFFLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.OCRVOTE.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.SYSDG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.asm
ONLINE ONLINE testdb1 Started
ONLINE ONLINE testdb2 Started
ONLINE ONLINE testdb3 Started
ONLINE ONLINE testdb4 Started
ora.gsd
OFFLINE OFFLINE testdb1
OFFLINE OFFLINE testdb2
OFFLINE OFFLINE testdb3
OFFLINE OFFLINE testdb4
ora.net1.network
ONLINE OFFLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.ons
ONLINE OFFLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.registry.acfs
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE testdb2
ora.cvu
1 ONLINE ONLINE testdb2
ora.oc4j
1 ONLINE ONLINE testdb4
ora.scan1.vip
1 ONLINE ONLINE testdb2
ora.testdb.db
1 ONLINE ONLINE testdb1 Open
2 ONLINE ONLINE testdb2 Open
3 ONLINE ONLINE testdb3 Open
4 ONLINE ONLINE testdb4 Open
ora.testdb1.vip
1 ONLINE INTERMEDIATE testdb4 FAILED OVER
ora.testdb2.vip
1 ONLINE ONLINE testdb2
ora.testdb3.vip
1 ONLINE ONLINE testdb3
ora.testdb4.vip
1 ONLINE ONLINE testdb4
4) Bring the en1 interface on node 1 back up
root@testdb1:/#ifconfig en1 up
5) The CRS resource status shows that the VIP has failed back normally
testdb3:/home/oracle(testdb3)$crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.LISTENER.lsnr
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.OCRVOTE.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.SYSDG.dg
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.asm
ONLINE ONLINE testdb1 Started
ONLINE ONLINE testdb2 Started
ONLINE ONLINE testdb3 Started
ONLINE ONLINE testdb4 Started
ora.gsd
OFFLINE OFFLINE testdb1
OFFLINE OFFLINE testdb2
OFFLINE OFFLINE testdb3
OFFLINE OFFLINE testdb4
ora.net1.network
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.ons
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
ora.registry.acfs
ONLINE ONLINE testdb1
ONLINE ONLINE testdb2
ONLINE ONLINE testdb3
ONLINE ONLINE testdb4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE testdb2
ora.cvu
1 ONLINE ONLINE testdb2
ora.oc4j
1 ONLINE ONLINE testdb4
ora.scan1.vip
1 ONLINE ONLINE testdb2
ora.testdb.db
1 ONLINE ONLINE testdb1 Open
2 ONLINE ONLINE testdb2 Open
3 ONLINE ONLINE testdb3 Open
4 ONLINE ONLINE testdb4 Open
ora.testdb1.vip
1 ONLINE ONLINE testdb1
ora.testdb2.vip
1 ONLINE ONLINE testdb2
ora.testdb3.vip
1 ONLINE ONLINE testdb3
ora.testdb4.vip
1 ONLINE ONLINE testdb4
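Test result: the listener on the test node (node 1) stopped. The SCAN listener, which had been running on that node, moved to another available node, and so did the node's VIP. When the interface came back up (public network restored), the VIP failed back normally and the node's listener came online automatically, but the SCAN listener and the SCAN VIP did not fail back. We then took down the public interface on each of the other nodes in turn and found that the SCAN listener moved to the node with the lowest instance_number, while the VIP moved to an arbitrary node.
4.4 Listener failure test
Done by killing the listener process.
Test result: existing connections were not affected. New connections could not reach the instance on that node, and applications connected to another node through TAF or automatic reconnection.
The listener process restarted automatically.
4.5 Single database instance crash test
Done by killing the pmon process.
Test result: after pmon was killed, the database instance crashed and then restarted automatically; once the restart finished, sessions reconnected automatically.
4.6 Simulating a CSSD process crash
Done by killing the cssd process.
Test result: after cssd was killed, the node rebooted and its VIP failed over to a surviving node. After the host came back up, CRS started automatically and the cluster reconfigured.
4.7 Simulating a CRSD process crash
Done by killing the crsd process.
Test result: after crsd.bin was killed, the process was restarted automatically within one minute. How it works: a crsd crash is detected by orarootagent, and crsd is restarted automatically.
4.8 Simulating an EVMD process crash
Done by killing the evmd process.
Test result: after evmd.bin was killed, the process was restarted automatically within one minute. How it works: an evmd crash is detected by ohasd, and evmd, orarootagent and crsd are restarted.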