Check Point Firewall ClusterXL Failure: Resolving the FIB "problem" State
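The office network has two Check Point firewalls running as a ClusterXL High Availability (active/standby) pair. The cluster showed the following fault: member CP-248 was in the Down state while CP-246 was Active, so active/standby failover between CP-246 and CP-248 could not succeed.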
[NJZQ-CP-248]# cphaprob stat
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 19.19.19.246 100% Active
2 (local) 19.19.19.248 0% Down
[NJZQ-CP-248]# cphaprob list //This command is very useful: it lists the critical components (which Check Point calls Devices) that the cluster monitors
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Registered Devices:
Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 705.3 sec
Device Name: Filter
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 699.2 sec
Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0.6 sec
Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.4 sec
Device Name: FIB
Registration number: 4
Timeout: none
Current state: problem
Time since last report: 1 sec
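When the listing is long, the failing device can be picked out mechanically. The following is a minimal sketch, assuming the output has the same "Device Name:" / "Current state:" layout as the listing above (other versions may format it differently). The function names parse_cphaprob_list and problem_devices are hypothetical helpers, not part of any Check Point tool, and SAMPLE is an abbreviated copy of the CP-248 output.

```python
import re

def parse_cphaprob_list(text):
    """Return {device name: state} from saved `cphaprob list` output.

    Assumes each device is reported as a 'Device Name:' line followed
    later by a 'Current state:' line, as in the listing above.
    """
    devices = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Device Name:\s*(.+)", line)
        if m:
            current = m.group(1).strip()
            continue
        m = re.match(r"Current state:\s*(.+)", line)
        if m and current is not None:
            devices[current] = m.group(1).strip()
            current = None
    return devices

def problem_devices(text):
    """Names of devices whose state is anything other than OK."""
    return [name for name, state in parse_cphaprob_list(text).items()
            if state.lower() != "ok"]

# Abbreviated copy of the CP-248 output shown above.
SAMPLE = """\
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Registered Devices:
Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Device Name: FIB
Registration number: 4
Timeout: none
Current state: problem
"""

print(problem_devices(SAMPLE))  # ['FIB']
```

Here the only device not reporting OK is FIB, which matches what the listing shows by eye.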
The corresponding output on CP-246 is as follows:
[NJZQ-CP-246]# cphaprob stat
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 (local) 19.19.19.246 100% Active
2 19.19.19.248 0% Down
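The member table printed by cphaprob stat can be checked the same way. This is a rough sketch, assuming each member row has the shape "number, optional (local), IP address, load%, state" as in the outputs above. Treating every state other than Active or Standby as unhealthy is an assumption made for this example, and parse_cphaprob_stat and unhealthy_members are hypothetical names.

```python
import re

# One member row: number, optional "(local)", IPv4 address, load %, state.
MEMBER_RE = re.compile(
    r"^(\d+)\s+(\(local\)\s+)?(\d{1,3}(?:\.\d{1,3}){3})\s+(\d+)%\s+(\S.*)$"
)

def parse_cphaprob_stat(text):
    """Return one dict per cluster member found in `cphaprob stat` output."""
    members = []
    for line in text.splitlines():
        m = MEMBER_RE.match(line.strip())
        if m:
            members.append({
                "id": int(m.group(1)),
                "local": m.group(2) is not None,
                "address": m.group(3),
                "load": int(m.group(4)),
                "state": m.group(5).strip(),
            })
    return members

def unhealthy_members(text):
    """Members whose state is neither Active nor Standby (an assumption)."""
    return [m for m in parse_cphaprob_stat(text)
            if m["state"] not in ("Active", "Standby")]

# Copy of the CP-248 output shown earlier.
SAMPLE = """\
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 19.19.19.246 100% Active
2 (local) 19.19.19.248 0% Down
"""

print([(m["address"], m["state"]) for m in unhealthy_members(SAMPLE)])
# [('19.19.19.248', 'Down')]
```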
The cphaprob list output on CP-246 shows nothing abnormal: every device is OK, and no FIB device is registered there.
[Expert@NJZQ-CP-246]# cphaprob list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Registered Devices:
Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 3077.4 sec
Device Name: Filter
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 3071.4 sec
Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0.2 sec
Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
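Putting the two listings side by side shows that the members do not even monitor the same set of devices. A small sketch of that comparison follows, assuming both outputs were saved as text. device_names and device_diff are hypothetical helpers, and the two samples keep only the "Device Name:" lines from the outputs above.

```python
def device_names(text):
    """Set of device names found on 'Device Name:' lines."""
    prefix = "Device Name:"
    return {
        line.strip()[len(prefix):].strip()
        for line in text.splitlines()
        if line.strip().startswith(prefix)
    }

def device_diff(local_text, peer_text):
    """Return (devices only on local member, devices only on peer)."""
    local, peer = device_names(local_text), device_names(peer_text)
    return sorted(local - peer), sorted(peer - local)

# Device Name lines taken from the CP-248 and CP-246 outputs above.
CP248 = """\
Device Name: Interface Active Check
Device Name: Synchronization
Device Name: Filter
Device Name: cphad
Device Name: fwd
Device Name: FIB
"""
CP246 = """\
Device Name: Interface Active Check
Device Name: Synchronization
Device Name: Filter
Device Name: cphad
Device Name: fwd
"""

print(device_diff(CP248, CP246))  # (['FIB'], [])
```

The FIB device is registered only on CP-248, which hints that the two members are configured differently.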
After seeing the symptoms above, ClusterXL on CP-248 was restarted as follows:
[NJZQ-CP-248]# expert
Enter expert password:
You are in expert mode now.
[Expert@NJZQ-CP-248]# clusterXL_admin down
Setting member to administratively down state ...
Member current state is Down
[Expert@NJZQ-CP-248]# clusterXL_admin up
Setting member to normal operation ...
Member current state is Down
Operation failed: member is still down, run 'cphaprob list' for further details
After the restart, the member still did not recover.
A fix was found online. Comparing the cpconfig menu entries on the two firewalls shows the following:
[NJZQ-CP-246]# expert
Enter expert password:
You are in expert mode now.
[Expert@NJZQ-CP-246]# cpconfig
This program will let you re-configure
your Check Point products configuration.
Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable cluster membership for this gateway
(7) Configure Check Point CoreXL
(8) Automatic start of Check Point Products
(9) Exit
Enter your choice (1-9) :
[NJZQ-CP-248]# expert
Enter expert password:
You are in expert mode now.
[Expert@NJZQ-CP-248]# cpconfig
This program will let you re-configure
your Check Point products configuration.
Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable Advanced Routing //Note: this entry does not exist on CP-246; it shows that Advanced Routing is currently enabled on this firewall.
(7) Disable cluster membership for this gateway
(8) Configure Check Point CoreXL
(9) Automatic start of Check Point Products
(10) Exit
Enter your choice (1-10) :6 //Choose 6 and press Enter to disable Advanced Routing.
Disable Advanced Routing...
============================
You have selected to disable advanced routing.
Are you sure? (y/n) [y] ? y //Enter y
In order to accomplish the action, CheckPoint services should be restarted.
Restart now ? (y/n) [y] ? y //Enter y; the Check Point service restart output follows.
Advanced Routing Suite is now stopped
Stopping SmartView Monitor daemon ...
SmartView Monitor daemon is not running
Stopping SmartView Monitor kernel ...
Driver is Down.
rtmstop: SmartView Monitor kernel is not loaded
FloodGate-1 is already stopped.
VPN-1/FW-1 stopped
SVN Foundation: cpd stopped
SVN Foundation: cpWatchDog stopped
SVN Foundation stopped
cpstart: Power-Up self tests passed successfully
cpstart: Starting product - SVN Foundation
SVN Foundation: Starting cpWatchDog
SVN Foundation: Starting cpd
SVN Foundation started
cpstart: Starting product - VPN-1
FireWall-1: starting external VPN module --OK
FireWall-1: Starting fwd
Installing Security Policy Office-Cluster-Policy on all.all@NJZQ-CP-248
Fetching Security Policy from localhost succeeded
Fetching Security Policy From:221.226.154.195 192.168.200.173
Local Policy is Up-To-Date.
The Policy was not installed because it is the same as the Policy already on the Module.
FireWall-1: enabling bridge forwarding
FireWall-1 started
cpstart: Starting product - FloodGate-1
FloodGate-1 is disabled. If you wish to start the service, please run 'etmstart enable'.
cpstart: Starting product - SmartView Monitor
SmartView Monitor: Not active
cpstart: Starting product - Advanced Routing
Advanced Routing is not enabled. Please use 'cpconfig' to enable it.
Advanced Routing was successfully disabled
Configuration Options:
----------------------
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Enable Advanced Routing
(7) Disable cluster membership for this gateway
(8) Configure Check Point CoreXL
(9) Automatic start of Check Point Products
(10) Exit
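The menu comparison that exposed the difference can also be done mechanically rather than by eye. The sketch below assumes the two cpconfig menus were captured as text and that entries look like "(N) label", as in the transcripts above. Because the numbering shifts when an extra entry is present, only the labels are compared. menu_labels and only_in_first are hypothetical names introduced for this example.

```python
import re

# A menu entry such as "(6) Disable Advanced Routing".
ENTRY_RE = re.compile(r"^\((\d+)\)\s+(.+)$")

def menu_labels(text):
    """Labels of the numbered cpconfig menu entries, in menu order."""
    labels = []
    for line in text.splitlines():
        m = ENTRY_RE.match(line.strip())
        if m:
            labels.append(m.group(2).strip())
    return labels

def only_in_first(first, second):
    """Labels that appear in the first menu but not in the second."""
    other = set(menu_labels(second))
    return [label for label in menu_labels(first) if label not in other]

# Menus copied from the two transcripts above (inline notes removed).
CP246_MENU = """\
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable cluster membership for this gateway
(7) Configure Check Point CoreXL
(8) Automatic start of Check Point Products
(9) Exit
"""
CP248_MENU = """\
(1) Licenses and contracts
(2) SNMP Extension
(3) PKCS#11 Token
(4) Random Pool
(5) Secure Internal Communication
(6) Disable Advanced Routing
(7) Disable cluster membership for this gateway
(8) Configure Check Point CoreXL
(9) Automatic start of Check Point Products
(10) Exit
"""

print(only_in_first(CP248_MENU, CP246_MENU))  # ['Disable Advanced Routing']
print(only_in_first(CP246_MENU, CP248_MENU))  # []
```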
After the services on CP-248 restarted, the cluster state was checked and had immediately returned to normal.
[Expert@NJZQ-CP-248]# cphaprob stat
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 221.226.154.195 100% Active
2 (local) 19.19.19.248 0% Standby
[Expert@NJZQ-CP-248]#
The cluster state as seen from CP-246 is as follows:
[Expert@NJZQ-CP-246]# cphaprob stat
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 (local) 19.19.19.246 100% Active
2 19.19.19.248 0% Standby
[Expert@NJZQ-CP-246]#
At this point the cluster formed by the two Check Point firewalls is healthy again, and active/standby failover works normally.