Oracle 11gR2 RAC: Error When Adding a New Node (IP Subnet Mask)
System environment:
Operating system: RedHat EL5
Cluster: Oracle GI (Grid Infrastructure)
Oracle: Oracle 11.2.0.1.0
Figure: RAC system architecture
With Oracle 11g, building a RAC first requires building the GI (Grid Infrastructure) layer.
Case analysis:
While adding a new node to an Oracle 11gR2 RAC, after grid had been added, running root.sh on node3 (host lxh4) produced the following error:
[root@lxh4 ~]# /u01/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /u01/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2014-07-10 15:00:36: Parsing the host name
2014-07-10 15:00:36: Checking for super user privileges
2014-07-10 15:00:36: User has super user privileges
Using configuration parameter file: /u01/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node lxh2, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'lxh4'
CRS-2676: Start of 'ora.mdnsd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'lxh4'
CRS-2676: Start of 'ora.gipcd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'lxh4'
CRS-2676: Start of 'ora.gpnpd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'lxh4'
CRS-2676: Start of 'ora.cssdmonitor' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'lxh4'
CRS-2672: Attempting to start 'ora.diskmon' on 'lxh4'
CRS-2676: Start of 'ora.diskmon' on 'lxh4' succeeded
CRS-2676: Start of 'ora.cssd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'lxh4'
CRS-2676: Start of 'ora.ctssd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'lxh4'
CRS-2676: Start of 'ora.drivers.acfs' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'lxh4'
CRS-2676: Start of 'ora.asm' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'lxh4'
CRS-2676: Start of 'ora.crsd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'lxh4'
CRS-2676: Start of 'ora.evmd' on 'lxh4' succeeded
Timed out waiting for the CRS stack to start.
The CRSD service failed to start!
Check the logs:
[root@lxh4 crsd]# more crsdOUT.log
2014-07-10 15:03:56
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:03:56
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:03:58
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:03:58
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:00
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:00
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:02
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:02
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:04
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:04
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:06
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:06
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:08
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:08
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:10
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:10
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:12
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:12
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:14
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:14
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:16
Changing directory to /u01/11.2.0/grid/log/lxh4/crsd
2014-07-10 15:04:16
CRSD REBOOT
CRSD exiting: Could not init OCR, code: 44
[root@lxh4 crsd]# tail crsd.log
2014-07-10 15:04:17.954: [ GPnP][3046512336]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=12069, tl=3, f=0
2014-07-10 15:04:17.966: [ OCRAPI][3046512336]clsu_get_private_ip_addresses: no ip addresses found.
2014-07-10 15:04:17.966: [GIPCXCPT][3046512336] gipcShutdownF: skipping shutdown, count 2, from [ clsinet.c : 1732], ret gipcretSuccess (0)
2014-07-10 15:04:17.968: [GIPCXCPT][3046512336] gipcShutdownF: skipping shutdown, count 1, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
[ OCRAPI][3046512336]a_init_clsss: failed to call clsu_get_private_ip_addr (7)
2014-07-10 15:04:17.970: [ OCRAPI][3046512336]a_init:13!: Clusterware init unsuccessful : [44]
2014-07-10 15:04:17.970: [ CRSOCR][3046512336] OCR context init failure. Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2014-07-10 15:04:17.970: [ CRSD][3046512336][PANIC] CRSD exiting: Could not init OCR, code: 44
2014-07-10 15:04:17.970: [ CRSD][3046512336] Done.
One error message stands out: "2014-07-10 15:04:17.966: [ OCRAPI][3046512336]clsu_get_private_ip_addresses: no ip addresses found." It suggests that the crsd startup failure is related to the private network interface.
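In a longer crsd.log, lines like this are easy to miss. The following is a minimal sketch of a filter for them, not an Oracle tool: the function name find_ocr_init_errors and the list of marker strings are assumptions, picked from the messages that appear in the excerpt above.

```python
import re

# Marker substrings taken from the crsd.log excerpt above. This list is an
# assumption for illustration, not an official catalogue of Oracle errors.
MARKERS = (
    "no ip addresses found",
    "Clusterware init unsuccessful",
    "OCR context init failure",
    "[PANIC]",
)

# crsd.log entries start with a timestamp such as "2014-07-10 15:04:17.966: "
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}: ")


def find_ocr_init_errors(log_text):
    """Return (timestamp, message) pairs for lines containing a marker."""
    hits = []
    for line in log_text.splitlines():
        if not any(marker in line for marker in MARKERS):
            continue
        match = TIMESTAMP.match(line)
        if match:
            stamp = match.group(0).rstrip(": ")
            message = line[match.end():].strip()
        else:
            # Some entries in the log carry no timestamp of their own.
            stamp, message = "", line.strip()
        hits.append((stamp, message))
    return hits


sample = """\
2014-07-10 15:04:17.954: [ GPnP][3046512336]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=12069, tl=3, f=0
2014-07-10 15:04:17.966: [ OCRAPI][3046512336]clsu_get_private_ip_addresses: no ip addresses found.
2014-07-10 15:04:17.970: [ CRSD][3046512336][PANIC] CRSD exiting: Could not init OCR, code: 44
"""

for stamp, message in find_ocr_init_errors(sample):
    print(stamp, "|", message)
```

On the three sample lines, only the OCRAPI and PANIC entries are reported; the GPnP initialization line is skipped.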
Check the network configuration:
[root@lxh4 crsd]# ifconfig eth2
eth2 Link encap:Ethernet HWaddr 08:00:27:D5:85:37
inet addr:10.10.10.13 Bcast:10.255.255.255 Mask:255.0.0.0
inet6 addr: fe80::a00:27ff:fed5:8537/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2213 errors:0 dropped:0 overruns:0 frame:0
TX packets:11568 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:281234 (274.6 KiB) TX bytes:4006863 (3.8 MiB)
[grid@lxh2 ~]$ /sbin/ifconfig eth2
eth2 Link encap:Ethernet HWaddr 08:00:27:AE:93:9A
inet addr:10.10.10.11 Bcast:10.10.10.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:feae:939a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:157629 errors:0 dropped:0 overruns:0 frame:0
TX packets:140367 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:68428231 (65.2 MiB) TX bytes:48684593 (46.4 MiB)
The netmask of node3's private network interface is 255.0.0.0, which does not match the netmask used on the other nodes (255.255.255.0)!
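The effect of the two netmasks is easier to see once each address is reduced to its network. The sketch below is illustrative only. It uses Python's standard ipaddress module with the addresses from the two ifconfig outputs above; the helper names private_subnet and mismatched_nodes are hypothetical. It also rests on an assumption that the log suggests but does not prove: that Clusterware recognizes the private interconnect by its subnet, so an interface that resolves to a different subnet is not found.

```python
import ipaddress


def private_subnet(ip, netmask):
    """Return the network an interface address belongs to."""
    return ipaddress.ip_interface(f"{ip}/{netmask}").network


def mismatched_nodes(interfaces, reference_node):
    """Return {node: subnet} for nodes whose subnet differs from the reference."""
    reference = private_subnet(*interfaces[reference_node])
    result = {}
    for node, (ip, netmask) in interfaces.items():
        subnet = private_subnet(ip, netmask)
        if subnet != reference:
            result[node] = subnet
    return result


# eth2 settings copied from the ifconfig outputs above.
interfaces = {
    "lxh2": ("10.10.10.11", "255.255.255.0"),
    "lxh4": ("10.10.10.13", "255.0.0.0"),
}

for node, (ip, netmask) in interfaces.items():
    print(node, private_subnet(ip, netmask))

print("mismatched:", mismatched_nodes(interfaces, "lxh2"))
```

With these inputs lxh2 resolves to 10.10.10.0/24 while lxh4 resolves to 10.0.0.0/8, so lxh4 is reported as the mismatched node even though both addresses look like neighbours.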
Solution:
1. Change the netmask of node3's private network interface to 255.255.255.0.
2. Clear the CRS configuration on the node and re-run root.sh.
[root@lxh4 install]# perl rootcrs.pl -deconfig -force
2014-07-10 15:12:13: Parsing the host name
2014-07-10 15:12:13: Checking for super user privileges
2014-07-10 15:12:13: User has super user privileges
Using configuration parameter file: ./crsconfig_params
PRCR-1035 : Failed to look up CRS resource ora.cluster_vip.type for 1
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.eons is registered
Cannot communicate with crsd
ACFS-9200: Supported
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'lxh4'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'lxh4'
CRS-2673: Attempting to stop 'ora.ctssd' on 'lxh4'
CRS-2673: Attempting to stop 'ora.evmd' on 'lxh4'
CRS-2673: Attempting to stop 'ora.asm' on 'lxh4'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'lxh4'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'lxh4'
CRS-2677: Stop of 'ora.cssdmonitor' on 'lxh4' succeeded
CRS-2677: Stop of 'ora.evmd' on 'lxh4' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'lxh4' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'lxh4' succeeded
CRS-2677: Stop of 'ora.asm' on 'lxh4' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'lxh4'
CRS-2677: Stop of 'ora.cssd' on 'lxh4' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'lxh4'
CRS-2673: Attempting to stop 'ora.diskmon' on 'lxh4'
CRS-2677: Stop of 'ora.diskmon' on 'lxh4' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'lxh4' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'lxh4'
CRS-2677: Stop of 'ora.drivers.acfs' on 'lxh4' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'lxh4' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'lxh4' has completed
CRS-4133: Oracle High Availability Services has been stopped.
error: package cvuqdisk is not installed
Successfully deconfigured Oracle clusterware stack on this node
[root@lxh4 crsd]# /u01/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /u01/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y
Copying dbhome to /usr/local/bin ...
The file "oraenv" already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y
Copying oraenv to /usr/local/bin ...
The file "coraenv" already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y
Copying coraenv to /usr/local/bin ...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2014-07-10 15:17:53: Parsing the host name
2014-07-10 15:17:53: Checking for super user privileges
2014-07-10 15:17:53: User has super user privileges
Using configuration parameter file: /u01/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node lxh2, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'lxh4'
CRS-2676: Start of 'ora.mdnsd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'lxh4'
CRS-2676: Start of 'ora.gipcd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'lxh4'
CRS-2676: Start of 'ora.gpnpd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'lxh4'
CRS-2676: Start of 'ora.cssdmonitor' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'lxh4'
CRS-2672: Attempting to start 'ora.diskmon' on 'lxh4'
CRS-2676: Start of 'ora.diskmon' on 'lxh4' succeeded
CRS-2676: Start of 'ora.cssd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'lxh4'
CRS-2676: Start of 'ora.ctssd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'lxh4'
CRS-2676: Start of 'ora.drivers.acfs' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'lxh4'
CRS-2676: Start of 'ora.asm' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'lxh4'
CRS-2676: Start of 'ora.crsd' on 'lxh4' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'lxh4'
CRS-2676: Start of 'ora.evmd' on 'lxh4' succeeded
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
lxh4 2014/07/10 15:20:10 /u01/11.2.0/grid/cdata/lxh4/backup_20140710_152010.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB. Actual 2047 MB Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'UpdateNodeList' was successful.
The script ran successfully!
Verification:
[root@lxh4 crsd]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
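When this check is repeated across several nodes, it can help to reduce the output to the lines that are not healthy. The following is a rough sketch under a simple assumption: a line counts as healthy only if it ends with "is online". The function name services_not_online is hypothetical, and the failing sample reuses the CRS-4535 message seen earlier in the deconfig output.

```python
def services_not_online(output):
    """Return the non-empty lines that do not report 'is online'."""
    problems = []
    for line in output.splitlines():
        text = line.strip()
        if text and not text.endswith("is online"):
            problems.append(text)
    return problems


healthy = """\
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
"""

failing = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
"""

print(services_not_online(healthy))
print(services_not_online(failing))
```

An empty list means every service reported itself online; anything else is the text of the lines that need attention.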
[root@lxh4 crsd]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DG1.dg     ora....up.type ONLINE    ONLINE    lxh2
ora.DG2.dg     ora....up.type ONLINE    ONLINE    lxh2
ora....ER.lsnr ora....er.type ONLINE    ONLINE    lxh3
ora....N1.lsnr ora....er.type ONLINE    ONLINE    lxh2
ora....VOTE.dg ora....up.type ONLINE    ONLINE    lxh2
ora.RCY.dg     ora....up.type ONLINE    ONLINE    lxh2
ora.asm        ora.asm.type   ONLINE    ONLINE    lxh2
ora.eons       ora.eons.type  ONLINE    ONLINE    lxh2
ora.gsd        ora.gsd.type   ONLINE    ONLINE    lxh2
ora.lxh.db     ora....se.type OFFLINE   OFFLINE
ora....taf.svc ora....ce.type OFFLINE   OFFLINE
ora....SM1.asm application    ONLINE    ONLINE    lxh2
ora....H1.lsnr application    ONLINE    OFFLINE
ora.lxh2.gsd   application    ONLINE    ONLINE    lxh2
ora.lxh2.ons   application    ONLINE    OFFLINE
ora.lxh2.vip   ora....t1.type ONLINE    ONLINE    lxh3
ora....SM2.asm application    ONLINE    ONLINE    lxh3
ora....H2.lsnr application    ONLINE    ONLINE    lxh3
ora.lxh3.gsd   application    ONLINE    ONLINE    lxh3
ora.lxh3.ons   application    ONLINE    OFFLINE
ora.lxh3.vip   ora....t1.type ONLINE    ONLINE    lxh3
ora....SM3.asm application    ONLINE    ONLINE    lxh4
ora....H3.lsnr application    ONLINE    ONLINE    lxh4
ora.lxh4.gsd   application    ONLINE    ONLINE    lxh4
ora.lxh4.ons   application    ONLINE    ONLINE    lxh4
ora.lxh4.vip   ora....t1.type ONLINE    ONLINE    lxh4
ora....network ora....rk.type ONLINE    ONLINE    lxh2
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    lxh2
ora.ons        ora.ons.type   ONLINE    ONLINE    lxh4
ora....ry.acfs ora....fs.type ONLINE    ONLINE    lxh2
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    lxh2
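A listing this long is easier to review if the rows whose State does not match an ONLINE Target are pulled out. The sketch below assumes the whitespace-separated layout of the crs_stat -t output shown above (name, type, target, state, optional host); the function name target_state_mismatches is hypothetical, and the sample rows are copied from that output.

```python
def target_state_mismatches(table_text):
    """Return names of resources with Target ONLINE but State not ONLINE."""
    names = []
    for line in table_text.splitlines():
        fields = line.split()
        # Skip the header, the dashed separator, and anything too short.
        if len(fields) < 4 or fields[0] == "Name":
            continue
        name, _type, target, state = fields[:4]
        if target == "ONLINE" and state != "ONLINE":
            names.append(name)
    return names


sample = """\
Name           Type           Target    State     Host
------------------------------------------------------------
ora.lxh.db     ora....se.type OFFLINE   OFFLINE
ora....H1.lsnr application    ONLINE    OFFLINE
ora.lxh2.ons   application    ONLINE    OFFLINE
ora.lxh4.ons   application    ONLINE    ONLINE    lxh4
ora.lxh4.vip   ora....t1.type ONLINE    ONLINE    lxh4
"""

print(target_state_mismatches(sample))
```

Resources whose Target is OFFLINE, such as ora.lxh.db here, are deliberately left out, since an OFFLINE state is what was requested for them.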
With that, the problem is resolved!