
Converting an Oracle 11gR2 RAC from a Cluster File System to ASM Extended RAC, with High-Availability Tests

Published: 2024-12-13 Author: 千家信息网 editor


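I had wanted to try out an ASM extended RAC for a long time but never had the hardware to test it on.
Persistence paid off: the resources finally became available.
There are two storage arrays, EMC and HDS, located in different machine rooms.
The original test system ran on a file system, so it first has to be converted to ASM, and only then can the ASM extended RAC be built.
The conversion ran into a series of problems. Working through them was frustrating at times, but once the extended RAC was up there was a real sense of achievement,
which anyone who has cracked a hard problem will probably recognize.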

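1. System environment

1.1 OS and DB versions

Host OS version: AIX 7.1 ("7100-02-03-1334")

Oracle version: Oracle 11.2.0.3 PSU10

RAC: yes

Number of nodes: 4

Storage: HDS 100G, EMC 50G

ASM or file system: cluster file system built with the Symantec VERITAS volume management tools

1.2 Hardware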

RAM : 128

SWAP: 13G

1.3 AIX /tmp file system

8GB

1.4 AIX JDK & JRE

IBM JDK 1.6.0.00 (64 BIT)

1.5 Directory sizes

/oracle 50GB

/oraclelog 30GB

/ocrvote 2G

/archivelog 400G

/oradata 850

1.6 Host IP configuration

100.15.64.180 testdb1

100.15.64.181 testdb2

100.15.64.182 testdb3

100.15.64.183 testdb4

100.15.64.184 testdb1-vip

100.15.64.185 testdb2-vip

100.15.64.186 testdb3-vip

100.15.64.187 testdb4-vip

100.15.64.188 testdb-scan

7.154.64.1 testdb1-priv

7.154.64.2 testdb2-priv

7.154.64.3 testdb3-priv

7.154.64.4 testdb4-priv


2. Converting the file system to ASM

2.1 Changing disk ownership and permissions

chown grid:asmadmin /dev/vx/rdmp/remc0_04a1

chown grid:asmadmin /dev/vx/rdmp/rhitachi_v0_11cd

chmod 660 /dev/vx/rdmp/remc0_04a1

chmod 660 /dev/vx/rdmp/rhitachi_v0_11cd

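(Note: because the test system uses Symantec's storage multipathing software, the disk attributes do not need to be changed.)

2.2 Creating the ASM instance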

su - grid

export DISPLAY=100.15.70.169:0.0

asmca

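(Note: create the OCRVOTE disk group with NORMAL redundancy and 2 failure groups. It needs at least 3 disks, and 3 is the recommended number: even if the failure groups hold more than 3 disks, migrating the voting files to this disk group still uses only 3 of them, and crsctl query css votedisk shows voting files on just 3 disks. The usable space of the disk group is limited by the failure group with the smallest total size.)

2.3 Creating the ASM disk groups SYSDG and DATADG and changing the disk group attributes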

su - grid

export DISPLAY=100.15.70.169:0.0

asmca



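Note: put the storage from the same site in the same failure group.

With ASM in Oracle 11g and later, the disk group attribute compatible.rdbms needs to be set to 11.2.0.0. Its default is 10.2.0.0, and if it is left unchanged, then with two failure groups, bringing a failure group back online after it has failed and been repaired raises the following error:

ORA-15283: ASM operation requires compatible.rdbms of 11.1.0.0.0 or higher

Command to change it: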

alter diskgroup SYSDG set attribute 'compatible.rdbms'='11.2.0.0';

select name,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup;

-- COMPATIBILITY: the ASM version (compatible.asm)

-- DATABASE_COMPATIBILITY: the database version the disk group is compatible with (compatible.rdbms)
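
The ORA-15283 condition comes down to a dotted-version comparison between the disk group's compatible.rdbms value and 11.1.0.0.0. A minimal shell sketch of that comparison follows, assuming the DATABASE_COMPATIBILITY value has already been read from v$asm_diskgroup. version_ge is a hypothetical helper written for illustration, not an Oracle utility.

```shell
# Hypothetical helper: succeed (exit 0) when dotted version $1 >= dotted version $2.
# Fields are compared numerically from left to right; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example use (the value would come from v$asm_diskgroup):
#   version_ge "10.2.0.0" "11.1.0.0.0" || echo "raise compatible.rdbms first"
```

Comparing field by field avoids the trap of a plain string comparison, which would mis-order values such as 9.x and 11.x.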

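2.4 Migrating the data files from the file system to ASM

No database had been created for this test, so there were no data files to migrate. If a migration is needed, do it with RMAN.

2.5 Migrating the OCR and voting disks to disk group OCRVOTE

1) Check the OCR and voting disks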

root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 3296

Available space (kbytes) : 258824

ID : 1187520997

Device/File Name : /ocrvote/ocr1

Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE a948649dc0e14f65bf171ba2ca496962 (/ocrvote/votedisk1) []

2. ONLINE a5f290d560684f47bf82eb3d34db5fc7 (/ocrvote/votedisk2) []

3. ONLINE 49617fb984fc4fcdbf5b7566a9e1778f (/ocrvote/votedisk3) []

Located 3 voting disk(s).

2) Check the resource status

$ crsctl stat res -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATADG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.LISTENER.lsnr

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.OCRVOTE.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.SYSDG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.asm

ONLINE ONLINE testdb1 Started

ONLINE ONLINE testdb2 Started

ONLINE ONLINE testdb3 Started

ONLINE ONLINE testdb4 Started

ora.gsd

OFFLINE OFFLINE testdb1

OFFLINE OFFLINE testdb2

OFFLINE OFFLINE testdb3

OFFLINE OFFLINE testdb4

ora.net1.network

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.ons

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.registry.acfs

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1 ONLINE ONLINE testdb1

ora.cvu

1 ONLINE ONLINE testdb1

ora.oc4j

1 ONLINE ONLINE testdb1

ora.scan1.vip

1 ONLINE ONLINE testdb1

ora.testdb1.vip

1 ONLINE ONLINE testdb1

ora.testdb2.vip

1 ONLINE ONLINE testdb2

ora.testdb3.vip

1 ONLINE ONLINE testdb3

ora.testdb4.vip

1 ONLINE ONLINE testdb4

3) Back up the OCR

root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -manualbackup

root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -showbackup

4) Add the OCR to the disk group, then delete the original OCR on the file system

root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -add +OCRVOTE

root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 3336

Available space (kbytes) : 258784

ID : 1187520997

Device/File Name : /ocrvote/ocr1

Device/File integrity check succeeded

Device/File Name : +OCRVOTE

Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrconfig -delete /ocrvote/ocr1

root@testdb1:/#/oracle/app/11.2.0/grid/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :

Version : 3

Total space (kbytes) : 262120

Used space (kbytes) : 3336

Available space (kbytes) : 258784

ID : 1187520997

Device/File Name : +OCRVOTE

Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded
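
Before moving on, it helps to confirm mechanically that no file-system OCR location is left. The sketch below assumes the ocrcheck output format shown above (lines of the form "Device/File Name : <location>"). ocr_locations and ocr_all_in_asm are hypothetical helper names, and the functions only parse text piped into them.

```shell
# Print every configured OCR location found in ocrcheck output on stdin.
ocr_locations() {
    awk -F':' '/Device\/File Name/ {
        loc = $2
        gsub(/^[ \t]+|[ \t]+$/, "", loc)   # trim surrounding whitespace
        print loc
    }'
}

# Succeed only if at least one location exists and all of them are ASM
# disk groups (names starting with "+").
ocr_all_in_asm() {
    ocr_locations | awk '
        { n++ }
        substr($0, 1, 1) != "+" { bad++ }
        END { exit (n > 0 && bad == 0) ? 0 : 1 }'
}

# Example use, as root on a cluster node:
#   /oracle/app/11.2.0/grid/bin/ocrcheck | ocr_all_in_asm && echo "OCR is only in ASM"
```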

5) Migrate the voting disks to the disk group

root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE a948649dc0e14f65bf171ba2ca496962 (/ocrvote/votedisk1) []

2. ONLINE a5f290d560684f47bf82eb3d34db5fc7 (/ocrvote/votedisk2) []

3. ONLINE 49617fb984fc4fcdbf5b7566a9e1778f (/ocrvote/votedisk3) []

Located 3 voting disk(s).

root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl replace votedisk +OCRVOTE

CRS-4256: Updating the profile

Successful addition of voting disk 3a5e5e8622024f17bf0c1a4594e303f5.

Successful addition of voting disk 92ff4555f7064f70bf3c022bd687dbc5.

Successful addition of voting disk 19a1fed74b7f4fb6bf780d43b5427dc9.

Successful deletion of voting disk a948649dc0e14f65bf171ba2ca496962.

Successful deletion of voting disk a5f290d560684f47bf82eb3d34db5fc7.

Successful deletion of voting disk 49617fb984fc4fcdbf5b7566a9e1778f.

Successfully replaced voting disk group with +OCRVOTE.

CRS-4256: Updating the profile

CRS-4266: Voting file(s) successfully replaced

root@testdb1:/#/oracle/app/11.2.0/grid/bin/crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 3a5e5e8622024f17bf0c1a4594e303f5 (/dev/vx/rdmp/emc0_04a1) [OCRVOTE]

2. ONLINE 92ff4555f7064f70bf3c022bd687dbc5 (/dev/vx/rdmp/hitachi_vsp0_11cc) [OCRVOTE]

3. ONLINE 19a1fed74b7f4fb6bf780d43b5427dc9 (/dev/vx/rdmp/emc0_04c1) [OCRVOTE]

Located 3 voting disk(s).

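3. Adding an NFS disk to disk group OCRVOTE as the third quorum disk

An ASM extended RAC needs a Linux PC server placed somewhere outside the two storage arrays, with a file system created on it. That file system is mounted over NFS on the ASM extended RAC servers, and the disk file on the NFS mount is generated with dd.

3.1 NFS server information

OS version: Linux el5 x86_64

3.2 Creating the grid user on the NFS server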

groupadd -g 1000 oinstall

groupadd -g 1100 asmadmin

useradd -u 1100 -g oinstall -G oinstall,asmadmin -d /home/grid -c "GRID Software Owner" grid

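Note: the user and group IDs on the NFS server should match those on the production hosts.

3.3 Creating the directory on the NFS server and setting its ownership, ready for the dd-generated disk file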

cd /oradata

mkdir votedisk

chown 1100:1100 votedisk

3.4 Editing /etc/exports on the NFS server and restarting NFS

vi /etc/exports

Add the following line:

/oradata/votedisk *(rw,sync,all_squash,anonuid=1100,anongid=1100)

service nfs stop

service nfs start

3.5 Checking that the NFS exports include the new votedisk directory

[root@ywtcdb ~]# exportfs -v

/oradata 100.15.64.*(rw,wdelay,no_root_squash,no_subtree_check,anonuid=65534,anongid=65534)

/oradata/votedisk

(rw,wdelay,root_squash,all_squash,no_subtree_check,anonuid=1100,anongid=1100)

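(Note: the /oradata/votedisk entry is the newly added part; it was highlighted in red in the original.)

3.6 Editing /etc/filesystems on the production hosts so the directory is mounted automatically at boot (run on every node)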

su - root

mkdir /voting_disk

chown grid:asmadmin /voting_disk

vi /etc/filesystems

Add the following stanza:

/voting_disk:

dev = "/oradata/votedisk"

vfs = nfs

nodename = ywtcdb

mount = true

options = rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys

account = false

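(Note: follow the format of the existing entries in /etc/filesystems exactly, including punctuation and spacing. It is best to configure NFS with smit nfs and then edit the options attribute of the corresponding mount point in /etc/filesystems afterwards. The options attribute must be rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys)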
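
Since the options string is long and easy to get wrong, a small check can report which required options are missing from a given options value. This is a sketch only: missing_nfs_options is a hypothetical helper, the required list is copied from the note above, and extracting the options value from /etc/filesystems is left to the reader.

```shell
# Required NFS mount options, copied from the note above.
REQUIRED_NFS_OPTS="rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys"

# Print each required option that is missing from the comma-separated list in $1.
# Prints nothing when every required option is present.
missing_nfs_options() {
    # Strip spaces and wrap in commas so each option can be matched as ",opt,".
    actual=",$(printf '%s' "$1" | tr -d ' '),"
    old_ifs=$IFS
    IFS=,
    for opt in $REQUIRED_NFS_OPTS; do
        case "$actual" in
            *",$opt,"*) ;;                  # present
            *) printf '%s\n' "$opt" ;;      # missing
        esac
    done
    IFS=$old_ifs
}

# Example use:
#   missing_nfs_options "rw,bg,hard,intr,vers=3,proto=tcp"
```

Matching whole comma-delimited tokens, rather than substrings, keeps an option such as rsize=32768 from being satisfied by a different value that merely contains it.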

Use smit nfs to set up the NFS mount at boot:

#smit nfs

[TOP] [Entry Fields]

* Pathname of mount point [/voting_disk]

* Pathname of remote directory [/oradata/votedisk]

* Host where remote directory resides [ywtcdb]

Mount type name []

* Security method [sys]

* Mount now, add entry to /etc/filesystems or both? both

* /etc/filesystems entry will mount the directory yes

3.7 Mounting the directory manually (run on every node)

/usr/sbin/nfso -p -o nfs_use_reserved_ports=1

nfso -p -o nfs_use_reserved_ports=1

su - root

mount -v nfs -o rw,bg,hard,intr,rsize=32768,wsize=32768,timeo=600,vers=3,proto=tcp,noac,sec=sys 100.15.57.125:/oradata/votedisk /voting_disk

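Note: in the command, 100.15.57.125 is the IP of the NFS server, /oradata/votedisk is the directory on the NFS server, and /voting_disk is the directory on the production host.

3.8 Generating a disk file with dd (on any one production node)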

dd if=/dev/zero of=/voting_disk/vote_disk_nfs bs=1M count=1000

3.9 Adding the new disk file to disk group OCRVOTE

su - grid

export DISPLAY=100.15.70.169:0.0

asmca

In asmca, change the Disk Discovery Path first.

Before:

/dev/vx/rdmp/*

After:

/voting_disk/vote_disk_nfs, /dev/vx/rdmp/*

Add the disk /voting_disk/vote_disk_nfs to a new failure group in disk group OCRVOTE. When that is done, OCRVOTE shows 3 failure groups.



3.10 Checking that a voting file is on the new disk

$ crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 89210622f0864ff0bf9517205691e679 (/voting_disk/vote_disk_nfs) [OCRVOTE]

2. ONLINE 55c4ee685a824ff3bf6ce510bf09468e (/dev/vx/rdmp/remc0_04a1) [OCRVOTE]

3. ONLINE 159234e88fe64f55bf0d4571362c3b07 (/dev/vx/rdmp/rhitachi_v0_11cd) [OCRVOTE]

Located 3 voting disk(s).
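
The same check can be scripted. The sketch below assumes the crsctl query css votedisk output layout shown above (state in the second column, path in parentheses in the fourth). online_votedisks and has_votedisk are hypothetical helper names that only parse text piped into them.

```shell
# Print the path of every ONLINE voting file found in
# "crsctl query css votedisk" output on stdin.
online_votedisks() {
    awk '$2 == "ONLINE" && $4 ~ /^\(.*\)$/ {
        path = $4
        gsub(/[()]/, "", path)   # drop the surrounding parentheses
        print path
    }'
}

# Succeed if the path given as $1 is one of the ONLINE voting files.
has_votedisk() {
    online_votedisks | grep -Fxq -- "$1"
}

# Example use, as the grid user:
#   crsctl query css votedisk | has_votedisk /voting_disk/vote_disk_nfs \
#       && echo "NFS voting file is online"
```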

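3.11 Create the database. Once that completes, the ASM extended RAC is in place.

4. ASM extended RAC high-availability tests

4.1 Pulling the EMC storage fibre from nodes 1 and 2 to simulate losing the storage on one side

The CSS log shows:

Node 1: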

2014-05-20 14:46:44.886:

[cssd(4129042)]CRS-1649:An I/O error occured for voting file: /dev/remc0_04a5; details at (:CSSNM00060:) in /oracle/app/11.2.0/grid/log/testdb1/cssd/ocssd.log.

2014-05-20 14:46:44.886:

[cssd(4129042)]CRS-1649:An I/O error occured for voting file: /dev/remc0_04a5; details at (:CSSNM00059:) in /oracle/app/11.2.0/grid/log/testdb1/cssd/ocssd.log.

2014-05-20 14:46:46.051:

[cssd(4129042)]CRS-1626:A Configuration change request completed successfully

2014-05-20 14:46:46.071:

[cssd(4129042)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

Node 2:

2014-05-20 14:46:46.053:

[cssd(4195026)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb2/cssd/ocssd.log.

2014-05-20 14:46:46.053:

[cssd(4195026)]CRS-1626:A Configuration change request completed successfully

2014-05-20 14:46:46.071:

[cssd(4195026)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

Node 3:

2014-05-20 14:46:46.053:

[cssd(3604942)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb3/cssd/ocssd.log.

2014-05-20 14:46:46.053:

[cssd(3604942)]CRS-1626:A Configuration change request completed successfully

2014-05-20 14:46:46.074:

[cssd(3604942)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

Node 4:

2014-05-20 14:46:46.053:

[cssd(3015132)]CRS-1604:CSSD voting file is offline: /dev/remc0_04a5; details at (:CSSNM00069:) in /oracle/app/11.2.0/grid/log/testdb4/cssd/ocssd.log.

2014-05-20 14:46:46.053:

[cssd(3015132)]CRS-1626:A Configuration change request completed successfully

2014-05-20 14:46:46.073:

[cssd(3015132)]CRS-1601:CSSD Reconfiguration complete. Active nodes are testdb1 testdb2 testdb3 testdb4 .

CRS status is normal:

testdb3:/oracle/app/11.2.0/grid/log/testdb3/cssd(testdb3)$/oracle/app/11.2.0/grid/bin/crsctl stat res -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATADG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.LISTENER.lsnr

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.OCRVOTE.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.SYSDG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.asm

ONLINE ONLINE testdb1 Started

ONLINE ONLINE testdb2 Started

ONLINE ONLINE testdb3 Started

ONLINE ONLINE testdb4 Started

ora.gsd

OFFLINE OFFLINE testdb1

OFFLINE OFFLINE testdb2

OFFLINE OFFLINE testdb3

OFFLINE OFFLINE testdb4

ora.net1.network

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.ons

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.registry.acfs

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1 ONLINE ONLINE testdb4

ora.cvu

1 ONLINE ONLINE testdb3

ora.oc4j

1 ONLINE ONLINE testdb3

ora.scan1.vip

1 ONLINE ONLINE testdb4

ora.testdb.db

1 ONLINE ONLINE testdb1 Open

2 ONLINE ONLINE testdb2 Open

3 ONLINE ONLINE testdb3 Open

4 ONLINE ONLINE testdb4 Open

ora.testdb1.vip

1 ONLINE ONLINE testdb1

ora.testdb2.vip

1 ONLINE ONLINE testdb2

ora.testdb3.vip

1 ONLINE ONLINE testdb3

ora.testdb4.vip

1 ONLINE ONLINE testdb4

The voting disks now look like this:

$ /oracle/app/11.2.0/grid/bin/crsctl query css votedisk

## STATE File Universal Id File Name Disk group

-- ----- ----------------- --------- ---------

1. ONLINE 8a31ddf5013d4fb1bfdbb01d6fc6eb7b (/dev/rhitachi_v0_11cc) [OCRVOTE]

2. ONLINE 1ef9486d54b24f8cbf07814d2848a009 (/voting_disk/vote_disk_nfs) [OCRVOTE]

Located 2 voting disk(s).

After the fibre is plugged back in, bring the disks online manually; the two sides then resynchronize automatically.

alter diskgroup SYSDG online disks in failgroup fail_1;

alter diskgroup DATADG online disks in failgroup fail_1;

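Test result

All EMC disks were automatically taken OFFLINE in the ASM disk groups on every node, the HDS storage stayed in service, and the instances on all nodes kept running normally. We also pulled the HDS fibre instead, and the behaviour matched the EMC case. The conclusion: when the storage on one side fails, the ASM extended RAC carries on with the healthy side and every instance stays up. After the fibre is reconnected and the disks are brought online manually, the two sides resynchronize automatically.

Note: the disk group holding the voting files brings its disks back online automatically once they are reattached.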
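
The reason the cluster survived with 2 of 3 voting files is the majority rule: a node has to reach more than half of the configured voting files to stay in the cluster. The arithmetic can be sketched as below. This is the general rule rather than something read from the logs above, and votedisk_majority and has_votedisk_majority are hypothetical helper names.

```shell
# Smallest number of voting files that is a strict majority of $1 configured files.
votedisk_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Succeed when $1 reachable voting files out of $2 configured form a strict majority.
has_votedisk_majority() {
    [ "$1" -ge "$(votedisk_majority "$2")" ]
}

# Example use:
#   has_votedisk_majority 2 3 && echo "node keeps its cluster membership"
```

With only two voting files, one per storage site, losing either site would leave 1 of 2, which is not a majority. That is why the third file sits on NFS outside both sites.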

4.2 Rebooting the hosts of nodes 1 and 2 to simulate a sudden host failure

After rebooting nodes 1 and 2, the CRS resource status is:

$ crsctl stat res -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.ARCHDG.dg

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.DATADG.dg

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.LISTENER.lsnr

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.OCRVOTE.dg

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.SYSDG.dg

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.asm

ONLINE ONLINE testdb3 Started

ONLINE ONLINE testdb4 Started

ora.gsd

OFFLINE OFFLINE testdb3

OFFLINE OFFLINE testdb4

ora.net1.network

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.ons

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.registry.acfs

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1 ONLINE ONLINE testdb3

ora.cvu

1 ONLINE ONLINE testdb3

ora.oc4j

1 ONLINE ONLINE testdb3

ora.scan1.vip

1 ONLINE ONLINE testdb3

ora.testdb.db

1 ONLINE OFFLINE

2 ONLINE OFFLINE

3 ONLINE ONLINE testdb3 Open

4 ONLINE ONLINE testdb4 Open

ora.testdb1.vip

1 ONLINE INTERMEDIATE testdb4 FAILED OVER

ora.testdb2.vip

1 ONLINE INTERMEDIATE testdb3 FAILED OVER

ora.testdb3.vip

1 ONLINE ONLINE testdb3

ora.testdb4.vip

1 ONLINE ONLINE testdb4

After nodes 1 and 2 came back up, the CRS status is:

$crsctl stat res -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATADG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.LISTENER.lsnr

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.OCRVOTE.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.SYSDG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.asm

ONLINE ONLINE testdb1 Started

ONLINE ONLINE testdb2 Started

ONLINE ONLINE testdb3 Started

ONLINE ONLINE testdb4 Started

ora.gsd

OFFLINE OFFLINE testdb1

OFFLINE OFFLINE testdb2

OFFLINE OFFLINE testdb3

OFFLINE OFFLINE testdb4

ora.net1.network

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.ons

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.registry.acfs

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1 ONLINE ONLINE testdb3

ora.cvu

1 ONLINE ONLINE testdb3

ora.oc4j

1 ONLINE ONLINE testdb4

ora.scan1.vip

1 ONLINE ONLINE testdb3

ora.testdb.db

1 ONLINE ONLINE testdb1 Open

2 ONLINE ONLINE testdb2 Open

3 ONLINE ONLINE testdb3 Open

4 ONLINE ONLINE testdb4 Open

ora.testdb1.vip

1 ONLINE ONLINE testdb1

ora.testdb2.vip

1 ONLINE ONLINE testdb2

ora.testdb3.vip

1 ONLINE ONLINE testdb3

ora.testdb4.vip

1 ONLINE ONLINE testdb4

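Test result

When one or more nodes go down, their VIPs move to surviving nodes and all clients reconnect to an available node. After the test hosts finish rebooting, CRS starts automatically and the VIPs move back.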
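
During a test like this it is handy to list which VIPs are currently failed over and where they are running. The sketch below assumes the crsctl stat res -t layout shown above, where a resource name line starting with "ora." is followed by its instance lines and a relocated VIP carries FAILED OVER in the state details. failed_over_vips is a hypothetical helper name.

```shell
# From "crsctl stat res -t" output on stdin, print "<vip resource> <hosting server>"
# for every VIP whose state details say FAILED OVER.
failed_over_vips() {
    awk '
        /^ora\./ { res = $1; next }                       # remember the current resource
        /FAILED OVER/ && res ~ /\.vip$/ { print res, $4 } # $4 is the SERVER column
    '
}

# Example use:
#   crsctl stat res -t | failed_over_vips
```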

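4.3 Simulating a public network outage

The hosts are virtualized, so the network cable cannot be pulled. Instead, ifconfig en1 down was used to take down the interface carrying node 1's public IP.

1) On node 1, the public IP, VIP, and SCAN IP are all on interface en1.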

root@testdb1:/#netstat -in

Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll

en1 1500 link#2 0.14.5e.79.5c.ca 5153732 0 4066346 2 0

en1 1500 100.15.64 100.15.64.180 5153732 0 4066346 2 0

en1 1500 100.15.64 100.15.64.184 5153732 0 4066346 2 0

en1 1500 100.15.64 100.15.64.188 5153732 0 4066346 2 0

en2 1500 link#3 0.14.5e.79.5b.e6 40305463 0 44224443 2 0

en2 1500 7.154.64 7.154.64.1 40305463 0 44224443 2 0

en2 1500 169.254 169.254.78.30 40305463 0 44224443 2 0

lo0 16896 link#1 2316784 0 2316787 0 0

lo0 16896 127 127.0.0.1 2316784 0 2316787 0 0

lo0 16896 ::1%1 2316784 0 2316787 0 0

2) Run ifconfig en1 down

root@testdb1:/oracle/app/11.2.0/grid/bin#ifconfig en1 down

3) The CRS resource status shows that the VIP and SCAN IP have moved to healthy nodes

testdb3:/home/oracle(testdb3)$crsctl stat res -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATADG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.LISTENER.lsnr

ONLINE OFFLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.OCRVOTE.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.SYSDG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.asm

ONLINE ONLINE testdb1 Started

ONLINE ONLINE testdb2 Started

ONLINE ONLINE testdb3 Started

ONLINE ONLINE testdb4 Started

ora.gsd

OFFLINE OFFLINE testdb1

OFFLINE OFFLINE testdb2

OFFLINE OFFLINE testdb3

OFFLINE OFFLINE testdb4

ora.net1.network

ONLINE OFFLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.ons

ONLINE OFFLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.registry.acfs

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1 ONLINE ONLINE testdb2

ora.cvu

1 ONLINE ONLINE testdb2

ora.oc4j

1 ONLINE ONLINE testdb4

ora.scan1.vip

1 ONLINE ONLINE testdb2

ora.testdb.db

1 ONLINE ONLINE testdb1 Open

2 ONLINE ONLINE testdb2 Open

3 ONLINE ONLINE testdb3 Open

4 ONLINE ONLINE testdb4 Open

ora.testdb1.vip

1 ONLINE INTERMEDIATE testdb4 FAILED OVER

ora.testdb2.vip

1 ONLINE ONLINE testdb2

ora.testdb3.vip

1 ONLINE ONLINE testdb3

ora.testdb4.vip

1 ONLINE ONLINE testdb4

4) Bring node 1's en1 interface back up

root@testdb1:/#ifconfig en1 up

5) The CRS resource status shows that the VIP has moved back

testdb3:/home/oracle(testdb3)$crsctl stat res -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATADG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.LISTENER.lsnr

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.OCRVOTE.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.SYSDG.dg

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.asm

ONLINE ONLINE testdb1 Started

ONLINE ONLINE testdb2 Started

ONLINE ONLINE testdb3 Started

ONLINE ONLINE testdb4 Started

ora.gsd

OFFLINE OFFLINE testdb1

OFFLINE OFFLINE testdb2

OFFLINE OFFLINE testdb3

OFFLINE OFFLINE testdb4

ora.net1.network

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.ons

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

ora.registry.acfs

ONLINE ONLINE testdb1

ONLINE ONLINE testdb2

ONLINE ONLINE testdb3

ONLINE ONLINE testdb4

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1 ONLINE ONLINE testdb2

ora.cvu

1 ONLINE ONLINE testdb2

ora.oc4j

1 ONLINE ONLINE testdb4

ora.scan1.vip

1 ONLINE ONLINE testdb2

ora.testdb.db

1 ONLINE ONLINE testdb1 Open

2 ONLINE ONLINE testdb2 Open

3 ONLINE ONLINE testdb3 Open

4 ONLINE ONLINE testdb4 Open

ora.testdb1.vip

1 ONLINE ONLINE testdb1

ora.testdb2.vip

1 ONLINE ONLINE testdb2

ora.testdb3.vip

1 ONLINE ONLINE testdb3

ora.testdb4.vip

1 ONLINE ONLINE testdb4

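Test result

The listener on the test node (node 1) stopped. The SCAN listener, which had been running on that node, moved to another available node, and so did the node's VIP. Once the interface came back up (public network restored), the VIP moved back and the node's listener came online automatically, but the SCAN listener and the SCAN VIP did not move back. We then took down the public interface on each of the other nodes in turn and found that the SCAN listener moved to the node with the lowest instance_number, while the VIP landed on an arbitrary node.

4.4 Listener failure test

Done by killing the listener process.

Test result

Existing connections were unaffected. New connections could not reach the instance on that node; applications connected to another node through TAF or automatic reconnection.

The listener process restarted automatically.

4.5 Single database instance crash test

Done by killing the pmon process.

Test result

After pmon was killed, the database instance crashed and then restarted automatically; once the restart finished, sessions reconnected automatically.

4.6 Simulating a CSSD process crash

Done by killing the cssd process.

Test result

After cssd was killed, the node rebooted and its VIP moved to a surviving node. Once the host was back up, CRS started automatically and the cluster reconfigured.

4.7 Simulating a CRSD process crash

Done by killing the crsd process.

Test result

After crsd.bin was killed, the process was restarted automatically within one minute. Mechanism: the crsd crash is detected by orarootagent, and crsd is then restarted automatically.

4.8 Simulating an EVMD process crash

Done by killing the evmd process.

Test result

After evmd.bin was killed, the process was restarted automatically within one minute. Mechanism: the evmd crash is detected by ohasd, and the evmd, orarootagent, and crsd processes are restarted.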

