AIX HACMP Operation Notes, 201101. Shared at tomroom.cublog.cn. Author: tomroom (环保男)
How to check the status of the AIX HA subsystem
How to start HACMP
How to stop HACMP
Location of the HACMP log files
Recommendations for working on two AIX servers configured for HA
How to view HACMP status
How to determine which server a resource group is on
Moving a resource group to another node in AIX HA
Differences between Bring a Resource Group Online, Bring a Resource Group Offline, and Move a Resource Group to Another Node / Site
Environment: AIX 6.1
The details follow.
How to check the status of the AIX HA subsystem
# lssrc -s clstrmgrES
Subsystem Group PID Status
clstrmgrES cluster 4326122 active
# set -o vi
# lssrc -ls clstrmgrES
Current state: ST_STABLE
sccsid = "@(#)36 1.135.5.2 src/43haes/usr/sbin/cluster/hacmprd/main.C, hacmp.pe, 53haes_r550, 0934B_hacmp550 8/8/09 14:48:23"
i_local_nodeid 0, i_local_siteid -1, my_handle 1
ml_idx[1]=0
There are 0 events on the Ibcast queue
There are 0 events on the RM Ibcast queue
CLversion: 10
local node vrmf is 5506
cluster fix level is "0"
The following timer(s) are currently active:
Current DNP values
DNP Values for NodeId - 0 NodeName - sc1prrhas01
PgSpFree = 0 PvPctBusy = 0 PctTotalTimeIdle = 0.000000
DNP Values for NodeId - 0 NodeName - sc1prrhas02
PgSpFree = 0 PvPctBusy = 0 PctTotalTimeIdle = 0.000000
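Current state: ST_STABLE means the cluster is running normally
Current state: ST_BARRIER means a start operation is in progress
Current state: ST_INIT is the state shown when the cluster is stopped on both nodes, even though the subsystem itself is started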
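The state check above can be scripted. The following is a minimal sketch, assuming the `lssrc -ls clstrmgrES` output contains a `Current state:` line as in the transcript; the function names `cluster_state` and `describe_state` are hypothetical helpers, and the descriptions cover only the three states mentioned in these notes.

```shell
#!/bin/sh
# Hypothetical helper: read `lssrc -ls clstrmgrES` output on stdin and
# print the value of the first "Current state:" line.
cluster_state() {
    awk -F': *' '/^Current state:/ { print $2; exit }'
}

# Hypothetical helper: map a state name to the meaning given in these notes.
describe_state() {
    case "$1" in
        ST_STABLE)  echo "cluster is up and stable" ;;
        ST_BARRIER) echo "cluster is in the middle of a start operation" ;;
        ST_INIT)    echo "subsystem is running but cluster services are stopped" ;;
        *)          echo "state not covered by these notes" ;;
    esac
}

# Usage on a cluster node:
#   state=$(lssrc -ls clstrmgrES | cluster_state)
#   echo "$state: $(describe_state "$state")"
```

Both helpers read text only, so they can also be run against output saved from another node.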
How to start HACMP
smit clstart starts HA
How to stop HACMP: run smitty clstop to stop the cluster. Stop the nodes one at a time: for example, with two AIX servers P1 and P2 in a cluster, confirm that HACMP on P1 has fully stopped before stopping HACMP on P2.
Location of the HACMP log files
# pwd
/var/hacmp/log
# ls hacmp.out
hacmp.out
The following files in this directory are backup copies created automatically by the system:
hacmp.out.1
hacmp.out.2
hacmp.out.3
hacmp.out.4
hacmp.out.5
hacmp.out.6
hacmp.out.7
Recommendations for working on two AIX servers configured for HA: open four windows, two to follow the HACMP log with tail -f and two to run commands.
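When following the log in the two tail -f windows, it can help to show only the event boundary lines. This is a minimal sketch, assuming hacmp.out marks events with the strings EVENT START, EVENT COMPLETED, and EVENT FAILED; `hacmp_events` is a hypothetical helper name, and the marker strings should be checked against the log on your own system.

```shell
#!/bin/sh
# Hypothetical helper: keep only event start/completion/failure lines
# from hacmp.out text read on stdin.
# Assumption: events are marked "EVENT START", "EVENT COMPLETED", "EVENT FAILED".
hacmp_events() {
    grep -E 'EVENT (START|COMPLETED|FAILED)'
}

# Usage on a cluster node (log path taken from the notes above):
#   tail -f /var/hacmp/log/hacmp.out | hacmp_events
```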
How to view HACMP status: run smit hacmp and select
Problem Determination Tools
View Current State
HACMP for AIX
Move cursor to desired item and press Enter.
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
Problem Determination Tools
Move cursor to desired item and press Enter.
HACMP Verification
View Current State
HACMP Log Viewing and Management
Recover From HACMP Script Failure
Restore HACMP Configuration Database from Active Configurat
Release Locks Set By Dynamic Reconfiguration
Clear SSA Disk Fence Registers
HACMP Cluster Test Tool
HACMP Trace Facility
HACMP Error Notification
Manage RSCT Services
Open a SMIT Session on a Node
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
[TOP]
Obtaining information via SNMP from Node: sc1prrhas02...
_____________________________________________________________________________
Cluster Name: SAP_p1p2_cluster
Cluster State: UP
Cluster Substate: STABLE
_____________________________________________________________________________
Node Name: sc1prrhas01 State: UP
Network Name: net_diskhb_01 State: UP
Address: Label: heartbeatp1 State: UP
Network Name: net_ether_01 State: UP
Address: 100.0.0.1 Label: sc1prrhas01_boot1 State: UP
How to determine which server a resource group is on: in the output below, the line sc1prrhas02 ONLINE (bold in the original document) shows that the resource group is on server P2.
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
[MORE...49]
Fallover Policy: Fallover To Next Priority Node In The List
Fallback Policy: Never Fallback
Site Policy: ignore
Node Group State
---------------------------- ---------------
sc1prrhas01 OFFLINE
sc1prrhas02 ONLINE
Resource Group Name: sc1prrvglob_res
Startup Policy: Online On Home Node Only
Fallover Policy: Fallover To Next Priority Node In The List
Fallback Policy: Never Fallback
Site Policy: ignore
Node Group State
---------------------------- ---------------
sc1prrhas01 OFFLINE
sc1prrhas02 ONLINE
Boot IPs used by AIX HA. Below are the boot IPs defined on one of the company's AIX systems.
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear belo
[MORE...16]
Address: 100.0.0.1 Label: sc1prrhas01_boot1 State: UP
Address: 100.0.1.1 Label: sc1prrhas01_boot2 State: UP
Address: 192.168.51.120 Label: sc1prrvdbin State: UP
Address: 192.168.51.121 Label: sc1prrvascs State: UP
Address: 192.168.51.122 Label: sc1prrvglob State: UP
Node Name: sc1prrhas02 State: DOWN
Network Name: net_diskhb_01 State: DOWN
Network Name: net_ether_01 State: DOWN
Address: 100.0.0.2 Label: sc1prrhas02_boot1 State: DOWN
Address: 100.0.1.2 Label: sc1prrhas02_boot2 State: DOWN
In the output above, Address: 100.0.0.1 Label: sc1prrhas01_boot1 is boot IP 1 of server P1.
The company's two AIX servers, P1 and P2, are configured for HA. Three virtual hosts correspond to three resource groups in HA; these are the ECC Virtual Hosts in the figure.
[Figure: ECC Virtual Hosts]
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
[MORE...32]
Address: 100.0.1.2 Label: sc1prrhas02_boot2 State: UP
Cluster Name: SAP_p1p2_cluster
Resource Group Name: sc1prrvdbin_res
Startup Policy: Online On Home Node Only
Fallover Policy: Fallover To Next Priority Node In The List
Fallback Policy: Never Fallback
Site Policy: ignore
Node Group State
---------------------------- ---------------
sc1prrhas01 ONLINE
sc1prrhas02 OFFLINE
Resource Group Name: sc1prrvascs_res
Startup Policy: Online On Home Node Only
Fallover Policy: Fallover To Next Priority Node In The List
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
[MORE...48]
Startup Policy: Online On Home Node Only
Fallover Policy: Fallover To Next Priority Node In The List
Fallback Policy: Never Fallback
Site Policy: ignore
Node Group State
---------------------------- ---------------
sc1prrhas01 ONLINE
sc1prrhas02 OFFLINE
Resource Group Name: sc1prrvglob_res
Startup Policy: Online On Home Node Only
Fallover Policy: Fallover To Next Priority Node In The List
Fallback Policy: Never Fallback
Site Policy: ignore
Node Group State
---------------------------- ---------------
sc1prrhas01 ONLINE
sc1prrhas02 OFFLINE
P1 and P2 host three resource groups. If only one of them is switched over, the output above shows, for each resource group, on which node it is online or offline.
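The per-resource-group check can be automated by parsing the state text shown in the screens above. This is a minimal sketch, assuming the text has the same layout as the transcript (a Resource Group Name: line followed by node/state pairs); `rg_online_nodes` is a hypothetical helper name. Feeding it from `/usr/es/sbin/cluster/utilities/cldump` is also an assumption, based on the SNMP header line in the View Current State output.

```shell
#!/bin/sh
# Hypothetical helper: read cluster state text on stdin and print
# "<resource group> <node>" for every node where the group is ONLINE.
rg_online_nodes() {
    awk '
        # Remember the most recent resource group name.
        /Resource Group Name:/ { rg = $NF; next }
        # A node line has exactly two fields: node name and state.
        rg != "" && NF == 2 && $2 == "ONLINE" { print rg, $1 }
    '
}

# Usage on a cluster node (assumed command, see the note above):
#   /usr/es/sbin/cluster/utilities/cldump | rg_online_nodes
```

Lines with more than two fields, such as the policy lines and the Node / Group State header, are ignored, so only plain ONLINE entries are reported.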
HA verification: run smit hacmp and select
HACMP Verification. It is recommended to run the verification while HA is stopped.
Problem Determination Tools
Move cursor to desired item and press Enter.
HACMP Verification
View Current State
HACMP Log Viewing and Management
Recover From HACMP Script Failure
Restore HACMP Configuration Database from Active Configuration
Release Locks Set By Dynamic Reconfiguration
Clear SSA Disk Fence Registers
HACMP Cluster Test Tool
HACMP Trace Facility
HACMP Error Notification
Manage RSCT Services
Open a SMIT Session on a Node
If an HA switchover fails partway with an error, you can run Recover From HACMP Script Failure to make HA ignore the failed step, skip it, and continue processing.
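Moving a resource group to another node in AIX HA (assuming HA has already been configured and tested)
Run smit hacmp and select the menus:
System Management (C-SPOC)
HACMP Resource Group and Application Management
Move a Resource Group to Another Node / Site
Move Resource Groups to Another Node
Then select the resource group and carry out the move.
The move operation above is not tied to a particular node; it can be run from any node in the cluster. (The team must coordinate so that only one person performs it at a time, to avoid interfering with each other.)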
HACMP for AIX
Move cursor to desired item and press Enter.
Initialization and Standard Configuration
Extended Configuration
System Management (C-SPOC)
Problem Determination Tools
System Management (C-SPOC)
Move cursor to desired item and press Enter.
Manage HACMP Services
HACMP Communication Interface Management
HACMP Resource Group and Application Management
HACMP Log Viewing and Management
HACMP File Collection Management
HACMP Security and Users Management
HACMP Logical Volume Management
HACMP Concurrent Logical Volume Management
HACMP Physical Volume Management
Configure GPFS
Open a SMIT Session on a Node
HACMP Resource Group and Application Management
Move cursor to desired item and press Enter.
Show the Current State of Applications and Resource Groups
Bring a Resource Group Online
Bring a Resource Group Offline
Move a Resource Group to Another Node / Site
Suspend/Resume Application Monitoring
Application Availability Analysis
Move a Resource Group to Another Node / Site
Move cursor to desired item and press Enter.
Move Resource Groups to Another Node
Move Resource Groups to Another Site
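Next, select one of the resource groups. After you select it and press Enter, you are asked to Select a Destination Node. Because the company's P1/P2 HA cluster has only two nodes, the only choice is the other node, sc1prrhas02. The following screen shows the resource group name and the node name; press Enter again to confirm, and only then does the move start.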
Move a Resource Group to Another Node / Site
Move cursor to desired item and press Enter.
Move Resource Groups to Another Node
Move Resource Groups to Another Site
+--------------------------------------------------------------------------+
| Select a Destination Node |
| |
| Move cursor to desired item and press Enter. |
| |
| # *Denotes Originally Configured Highest Priority Node |
| sc1prrhas02 |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1=H| /=Find n=Find Next |
F9=S+--------------------------------------------------------------------------+
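When the move finishes, the following is displayed. Paging down through the output shows the current state of every resource group in this cluster on each server.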
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
[TOP]
Attempting to move resource group sc1prrvascs_res to node sc1prrhas02.
Waiting for the cluster to process the resource group movement request....
Waiting for the cluster to stabilize.................
Resource group movement successful.
Resource group sc1prrvascs_res is online on node sc1prrhas02.
Cluster Name: SAP_p1p2_cluster
Resource Group Name: sc1prrvdbin_res
Node State
---------------------------- ---------------
sc1prrhas01 OFFLINE
sc1prrhas02 ONLINE
[MORE...13]
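If the output of a move is saved to a file, the success line can be picked out mechanically. This is a minimal sketch, assuming the message format is exactly the one shown in the transcript above; `rg_moved_to` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read saved move output on stdin and print
# "<resource group> <node>" for each
# "Resource group X is online on node Y." line.
rg_moved_to() {
    sed -n 's/^Resource group \(.*\) is online on node \(.*\)\.$/\1 \2/p'
}

# Usage, with a hypothetical file holding the saved output:
#   rg_moved_to < move_output.txt
```

No output from the helper means the success line was not found, in which case hacmp.out is the place to look.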
The company's HA switchover scripts are located under /etc/hacmp
# pwd
/etc/hacmp
# ls
startsc1prrvascs.sh startsc1prrvglob.sh stopsc1prrvdbin.sh
startsc1prrvdbin.sh stopsc1prrvascs.sh stopsc1prrvglob.sh
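Differences between Bring a Resource Group Online, Bring a Resource Group Offline, and Move a Resource Group to Another Node / Site:
After a move, the resource group is always online on one node of the HA cluster, whereas Bring a Resource Group Offline takes the resource group offline on all nodes. For example, if the resource group contains a VG, then after the resource group is brought offline, the VG specified in it is varied off and unavailable on every server node in the cluster.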
HACMP Resource Group and Application Management
Move cursor to desired item and press Enter.
Show the Current State of Applications and Resource Groups
Bring a Resource Group Online
Bring a Resource Group Offline
Move a Resource Group to Another Node / Site
Suspend/Resume Application Monitoring
Application Availability Analysis
The company does not use the last two functions in the menu above:
Suspend/Resume Application Monitoring
Application Availability Analysis