hacmpdoc

Upload: omkar

Post on 05-Apr-2018






  • 8/2/2019 HACMPDOC


What is LVM?

LVM stands for Logical Volume Manager, which is the fundamental way to manage UNIX/Linux storage systems in a scalable manner. An LVM abstracts disk devices into pools of storage space called Volume Groups. These volume groups are in turn subdivided into virtual disks called Logical Volumes. The logical volumes may be used just like regular disks, with filesystems created on them and mounted in the Unix/Linux filesystem tree. The logical volumes can span multiple disks. Even

though a lot of companies have implemented their own LVMs for *nixes, the one created by the Open Software Foundation (OSF) was integrated into many Unix systems and serves as a base for the Linux implementation of LVM.

LVM Creation

To create an LVM, we follow a three-step process.

Step One: We need to select the physical storage resources that are going to be used for LVM. Typically, these are standard partitions, but they can also be Linux software RAID volumes that we've created. In LVM terminology, these storage resources are called "physical volumes" (e.g. /dev/hda1, /dev/hda2, etc.).

Our first step in setting up LVM involves properly initializing these partitions so that they can be recognized by the LVM system. This involves setting the correct partition type (usually using the fdisk command, and entering the type of partition as 'Linux LVM' - 0x8e) if we're adding a physical partition, and then running the pvcreate command.

# pvcreate /dev/hda1 /dev/hda2 /dev/hda3
# pvscan

Step Two: Creating a volume group. You can think of a volume group as a pool of storage that consists of one or more physical volumes. While LVM is running, we can add physical volumes to the volume group or even remove them.

First, initialize the /etc/lvmtab and /etc/lvmtab.d files by running the following command:

    # vgscan

Now you can create a volume group and assign one or more physical volumes to the volume group.

# vgcreate my_vol_grp /dev/hda1 /dev/hda2

Behind the scenes, the LVM system allocates storage in equal-sized "chunks", called extents. We can specify the particular extent size to use at volume group creation time. The size of an extent defaults to 4 MB, which is fine for most uses; you can use the -s flag to change it. The extent size affects the minimum size of changes which can be made to a logical volume in the volume group, and the maximum size of logical and physical volumes in the volume group. A logical volume can contain at most 65534 extents, so the default extent size (4 MB) limits the volume to about 256 GB; a size of 1 TB would require extents of at least 16 MB. So to accommodate a 1 TB size, the above command can be rewritten as:

    # vgcreate -s 16M my_vol_grp /dev/hda1 /dev/hda2
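The extent arithmetic above can be sanity-checked with plain shell arithmetic. A quick sketch; the 65534-extent ceiling is the limit quoted above:

```shell
# Maximum logical volume size = maximum extents (65534) * extent size.
extents=65534
for size_mb in 4 16; do
    max_gb=$(( extents * size_mb / 1024 ))
    echo "extent size ${size_mb} MB -> max LV roughly ${max_gb} GB"
done
```

With 4 MB extents the cap works out to roughly 255 GB (the "about 256 GB" above); with 16 MB extents it is roughly 1023 GB, i.e. about 1 TB.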

You can check the result of your work at this stage by entering the command:

# vgdisplay

This command displays the total physical extents in a volume group, the size of each extent, the allocated size, and so on.

Step Three: This step involves the creation of one or more "logical volumes" using our volume group storage pool. The logical volumes are created from volume groups, and may have arbitrary names. The size of the new volume may be requested in either extents (-l switch) or in KB, MB, GB


or TB (-L switch), rounding up to whole extents.

    # lvcreate -l 50 -n my_logical_vol my_vol_grp

The above command allocates 50 extents of space in my_vol_grp to the newly created my_logical_vol. The -n switch specifies the name of the logical volume we are creating.
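The relationship between the -l (extents) and -L (size) switches can be sketched in shell arithmetic: the number of extents needed for a requested size is the size divided by the extent size, rounded up to a whole extent. The helper name below is ours:

```shell
# Extents needed for a requested size, rounding up to whole extents,
# mirroring the -L behaviour described above.
extents_for() {  # $1 = requested size in MB, $2 = extent size in MB
    echo $(( ($1 + $2 - 1) / $2 ))
}

extents_for 500 4    # 500 MB with 4 MB extents -> 125
extents_for 50 16    # 50 MB with 16 MB extents -> 4 (rounded up)
```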

Now you can check if you got the desired results by using the command:

# lvdisplay

which shows the information for your newly created logical volume. Once a logical volume is created, we can go ahead and put a filesystem on it, mount it, and start using the volume to store our files. For creating a filesystem, we do the following:

    # mke2fs -j /dev/my_vol_grp/my_logical_vol

    The -j signifies journaling support for the ext3 filesystem we are creating.

    Mount the newly created file system :

    # mount /dev/my_vol_grp/my_logical_vol /data

    Also do not forget to append the corresponding line in the /etc/fstab file:

    #File: /etc/fstab

    /dev/my_vol_grp/my_logical_vol /data ext3 defaults 0 0

    Q. What characters should a hostname contain for HACMP configuration?

A. The hostname cannot contain the following characters: -, _, * or other special characters.
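A quick way to screen candidate hostnames against this rule is a small shell check. This is a sketch: the helper name and the exact allowed set (letters and digits only, starting with a letter) are our reading of the restriction above:

```shell
# Hypothetical helper: accept only names made of letters and digits
# (no '-', '_', '*' or other special characters).
valid_hacmp_hostname() {
    printf '%s' "$1" | grep -Eq '^[A-Za-z][A-Za-z0-9]*$'
}

valid_hacmp_hostname "btcppesrv5" && echo "btcppesrv5: ok"
valid_hacmp_hostname "node_1"     || echo "node_1: rejected"
```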

Q. Can the Service IP and Boot IP be in the same subnet?

    A. No.
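This restriction can be checked numerically by masking each address with the netmask and comparing the resulting network addresses. A sketch using the example addresses configured later in this document (the helper name is ours):

```shell
# Convert a dotted-quad address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

mask=$(ip_to_int 255.255.254.0)     # the /23 netmask used below
boot=$(ip_to_int 10.73.70.155)      # boot IP of node btcppesrv5
svc=$(ip_to_int 10.73.68.222)       # service IP btcppesrv5sv

if [ $(( boot & mask )) -ne $(( svc & mask )) ]; then
    echo "different subnets: valid for HACMP"
else
    echo "same subnet: cluster verification will fail"
fi
```

With the /23 netmask, the boot address falls in 10.73.70.0/23 and the service address in 10.73.68.0/23, so this pair satisfies the requirement.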

The service IP address and the boot IP address cannot be in the same subnet; this is a basic requirement for HACMP cluster configuration. The verification process does not allow the IP addresses to be in the same subnet, and the cluster will not start.

Q. Can multiple Service IP addresses be configured on a single Ethernet card?

    A. Yes.

Using the SMIT menu, multiple Service IP addresses can be configured to run on a single Ethernet card. It only requires selecting the same network name for the specific Service IP addresses in the SMIT menu.

Q. What happens when a NIC holding a Service IP goes down?

A. When a NIC running the Service IP address goes down, HACMP detects the failure and fails over the service IP address to an available standby NIC on the same node or to another node in the cluster.

Q. Can multiple Oracle Database instances be configured on a single node of an HACMP cluster?

A. Yes. Multiple database instances can be configured on a single node of an HACMP cluster. For this, one needs separate Service IP addresses over which the listeners for each Oracle database will run. Hence one can have separate resource groups, each owning one Oracle instance. This


configuration is useful when a single Oracle Database instance fails on one node and must be failed over to another node without disturbing the other running Oracle instances.

    Q. Can HACMP be configured in Active - Passive configuration?

A. Yes. For an Active-Passive cluster configuration, do not configure any Service IP on the passive node. Also, for all the resource groups on the active node, specify the passive node as the next

node in priority to take over in the event of a failure of the active node.

Q. Can a file system mounted over the NFS protocol be used for disk heartbeat?

A. No. A volume mounted over the NFS protocol is a file system to AIX, and since a disk device is required for the enhanced concurrent capable volume group used for disk heartbeat, an NFS file system cannot

be used for configuring the disk heartbeat. One needs to provide a disk device to the AIX hosts over the FCP or iSCSI protocol.

Q. Which HACMP log files are available for troubleshooting?

A. The following log files can be used for troubleshooting:

1. /var/hacmp/clverify/current/

2. /var/hacmp/clverify/pass/
3. /var/hacmp/clverify/fail/
4. /tmp/hacmp.out records the output generated by the event scripts of HACMP as they execute.
5. /tmp/clstmgr.debug contains time-stamped messages generated by HACMP clstrmgrES activity.
6. /tmp/cspoc.log contains messages generated by HACMP C-SPOC commands.
7. /usr/es/adm/cluster.log is the main HACMP log file. HACMP error messages and messages about HACMP-related events are appended to this log.
8. /var/adm/clavan.log keeps track of when each application managed by HACMP is started or stopped, and of when a node on which an application is running stops.
9. /var/hacmp/clcomd/clcomd.log contains messages generated by the HACMP cluster communication daemon.
10. /var/ha/log/grpsvcs
11. /var/ha/log/topsvcs
12. /var/ha/log/grpglsm tracks the execution of internal activities of the grpglsm daemon.

    Q. What is high availability all about?

A. High availability is about making an application highly available. It isn't about making the hardware highly available. I've never met a user who really cared if the server was running. On the other hand, I've met a LOT of users who care if the application(s) running on the server is/are running. Focus on making the application highly available and use the hardware merely as a tool to achieve that goal.

I've taught a fair number of HACMP courses for IBM. I've found that a lot of students struggle with the basic HACMP concepts until they realize that high availability is about making an application highly available. Once they grasp this concept, the rest of the HACMP concepts start to look pretty obvious.

    Q. What does HACMP stand for?

A. HACMP is an abbreviation for High Availability Cluster Multi-Processing.

The High Availability part refers to the HACMP features which enable one to build a cluster of multiple nodes (i.e. multiple IBM RS/6000's) which work together to provide a highly available application.

    The Cluster Multi-Processing part refers to the HACMP features which enable one to build a

cluster of multiple nodes which work together to provide improved application performance (i.e. parallel processing).

    Note that the two feature sets overlap and can be used together to build a cluster of multiple nodes


    which work together to provide improved application performance and high availability.

    Q. Do I need a highly available cluster for my application?

A. The short answer is (probably): if you aren't sure whether you need HACMP, then you don't need HACMP.

The medium answer is: if you don't know whether you need a highly available cluster for your application,

then you almost certainly don't (or you really don't understand your application's requirements, in which case you're not ready to put your application into a highly available cluster).

    The long answer (known as Jose's Law in honour of the person who came up with it) is:

1. Ask your manager if the application is business critical.
a. If your manager says yes, investigate the tolerable downtime and economic loss of unavailability (you need to justify the cost of setting up an HA cluster).
b. If the answer is no, ask his permission to investigate deeper.

2. In either case, investigate further by first powering down the server. Wait for the calls. Say you are fixing it. Count the calls. Ask if it's critical. Ask how critical it is.
a. If nobody who matters complains, then format the disks and install an MP3 & Quake server.
b. If the house is on fire, then go back to your manager. You need HACMP and you now have the business case to support the need.

If you don't like any of these answers, how about contributing a better one.

Q. What's the practical difference between a rotating resource group and a cascading resource group with the (new with HACMP 4.4) cascading without fallback option enabled?

    A. Short answer: They're quite similar.

Medium answer: In practical terms, the two are roughly equivalent. Use the one which best represents how you intend to manage the resource group:
* create it as a rotating resource group if you intend to leave the resource group on whichever node it happens to be running on today for an extended period of time (i.e. you're treating it like a rotating resource group).
* create it as a cascading without fallback resource group if you intend to normally run the application on the primary node (i.e. you intend to move it back to the primary node at the earliest appropriate and/or convenient opportunity).

There are, of course, other differences, including that NFS exports, NFS cross mounts and such are supported only in cascading resource groups.

Long answer: Although the medium answer is (arguably) correct, there's a subtle difference between a rotating resource group and a cascading without fallback resource group that's worth considering:
* when the current node in a rotating resource group fails, the resource group is moved to the next node in the rotation. The service IP address in the resource group replaces the boot adapter's IP address on the takeover node.
* when the current node in a cascading resource group fails, the resource group moves to the next lower priority node in the resource group. The service IP address (if configured) replaces the IP address of one of the takeover node's standby adapters.

At first glance, this may appear to be a meaningless difference, but the difference can be significant. If the takeover node's standby adapter(s) have already been used to recover from adapter failures on the takeover node, then the fallover of a cascading resource group will fail due to the lack of an available standby adapter. On the other hand, a rotating resource group is less likely to run into this kind of trouble, since the takeover node's boot adapter will tend to remain available: HACMP on the takeover node will do a swap adapter if the boot adapter's physical network interface dies.


Another way of looking at this is that any failure, even an apparently innocuous failure of a standby adapter on a backup node, needs to be taken seriously and fixed promptly.

Yet another way of looking at this is that one needs to make sure that takeover nodes have enough standby adapters to provide the level of redundancy that is required to provide the appropriate level of availability.

Q. Do I need a serial network in my cluster?

A. Short answer: yes.

Medium answer: failure to configure a serial network in your cluster will result in a cluster which has a single point of failure. A missing or improperly configured serial network will also increase the likelihood of getting a partitioned cluster (trust me - you don't want to get one of these, as a potential consequence is really quite nasty data corruption problems!).

Long answer: failure to configure a serial network could result in unnecessary failovers resulting from:

* the loss of the IP network (for example, switch death) resulting in each cluster node deciding that all other cluster nodes are dead and trying to take over applications which are still running on the other cluster nodes (you do NOT want this to happen).

Note that this scenario is pretty much the worst-case example of a partitioned cluster. Slightly less spectacular but still very unpleasant partitioned cluster scenarios can occur (in clusters without properly configured serial networks) if network component failure results in groups of cluster nodes that can communicate within each group but not between the groups.

* bursts of high packet traffic on the IP networks resulting in the loss of a series of heartbeat packets.

* failure of IP on a node resulting in other nodes deciding that the node with failed IP is down, with the same result as the previous point.

If your cluster uses SSA disks, seriously consider using TMSSA (Target Mode SSA) to implement your serial network. In addition to being a robust serial network, this has the side-effect of causing HACMP to monitor your SSA loops, since a failure of a TMSSA network (which HACMP will report) is, by definition, a serious problem w.r.t. your shared disks.

If you implement TMSSA, be sure to test it carefully (you'll need to sever all SSA connections between the hosts in order to cause a failure).

Ideally, every node in the cluster should be directly connected to every other node in the cluster via a serial network (one serial network for each pair of nodes). This rapidly becomes impractical as the number of nodes in the network gets larger. As a bare minimum, think of every node in the cluster as a router and ensure that every node in the cluster has a (conceptual) path to every other node in the cluster with said path only using serial networks. For example, a five node cluster with nodes A, B, C, D and E could meet this criterion with four serial networks - A to B, B to C, C to D and D to E. A somewhat better configuration would be to add a fifth network connecting E to A, since this ensures that all surviving nodes have a (conceptual) "serial network path" to all other nodes even if one of the nodes fails.

    Q. Should I use SCSI disks as shared disks in an HACMP cluster?

    A. Short answer: only if you absolutely have to.

Medium answer: your cluster will almost certainly fail over faster if you use SSA disks or some other disk technology which doesn't suffer from ghost disks. Also, the inability to connect or disconnect (without powering down) boxes which are cabled together using SCSI cables can result in longer outages when dealing with certain disk-enclosure-related failures.

Long answer: I've recently had experience building a cluster that uses a pair of IBM 2104-DL1 SCSI enclosures to provide shared disk storage. We've been unable to avoid getting ghost disks when a cluster node is rebooted at a point in time when the other node has the disks varied online. Since there are 24 shared disks in the cluster, dealing with the ghost disks results in failover times of about


ten minutes for the cluster.

The other side of the coin is that using the 2104-DL1's saved the customer over $75,000 CAD. Whether or not the reduced failover time that one would experience with SSA disks is worth $75,000 CAD is a question that only the customer can answer.

Some additional points to ponder:
* It should be kept in mind that most if not all SCSI subsystems aren't supported by HACMP if you're using concurrent resource groups.
* I've been told that TMSCSI (Target Mode SCSI) is a bad idea because it significantly impacts the SCSI bus's performance due to the overhead involved in turning around the SCSI bus for each round trip. I've not done any research, measurements or testing to see if what I've been told is correct. I'd suggest that if you must use shared SCSI disks then you should either avoid TMSCSI or do some VERY careful testing and measuring before you conclude that it's safe to use. Note that TMSSA is a completely different fish - it is a very lightweight protocol which (I've been told, and which the SSA and TMSSA specifications seem to suggest) doesn't interfere with any other SSA activity on your shared SSA loops.

    Q. Which of the logical volumes on my shared VGs do I need to mirror?

A. Short answer: all of them.

Medium answer: if there is a logical volume on your shared disks that isn't worth mirroring, then delete it. If it isn't worth mirroring then it isn't worth keeping.

Long answer: you need to mirror all logical volumes which your application uses. This specifically includes temporary space (e.g. a big file system used by the application for data caching purposes). The reason is simple: if you lose a disk that contains the only copy of any of your logical volumes, then the application will either hang or suffer disk I/O errors when it tries to access the lost space. This could seriously affect the availability of your application.

Bottom line: mirror EVERYTHING on the shared VGs. Do NOT use the mirrorvg command unless you are REALLY careful or you have only a two-disk shared VG. You need to be careful to get the mirrors onto the right physical volumes, to ensure that the two halves of each mirror are in/on different adapters, busses, paths, power supplies, disk cabinets, nuclear reactors, etc, etc, etc.

    Q. What about rootvg? Do I need to mirror it?

    A. Short answer: yes.

Medium answer: yes. Use a two-disk rootvg and check out the mirrorvg command (part of AIX since AIX 4.2.1) for an easy way to do this. Put the disks on separate controllers if at all possible. Read the man page carefully - there are some important issues that you need to get right when mirroring rootvg.

Long answer: you need to mirror all logical volumes which your application uses. The last time that we checked, all applications use the operating system, and the operating system (i.e. AIX) uses rootvg logical volumes, so mirror them. One that is sometimes missed is the paging space. Mirror the paging space if you want your node to be able to survive the loss of a physical volume containing paging space (i.e. if you don't mirror it then you've got a single point of failure).

Most existing HACMP clusters are configured to send system dumps to the primary paging space. Some (all?) versions of AIX won't generate a system dump into a mirrored paging space. Create a separate unmirrored dump space if this restriction applies to your version of AIX (assume that it does if you aren't sure). Even if AIX supported them, a mirrored dump space would be a waste of disk space, since you won't need it unless the AIX kernel crashes, and a kernel crash combined with a disk failure is either a double failure or was almost certainly caused by the disk failure.

Another factor to consider is that AIX doesn't respond well if an unmirrored rootvg physical volume is lost. Basically what happens is that the parts of AIX which notice the problem either hang or are terminated by the disk errors. The end result is that the operating system's features and facilities gradually stop working until, eventually, someone notices (i.e. a human) or something critical terminates; if the HACMP cluster manager is ever affected (rather unlikely, but it happens) then the


node will die about fifteen seconds later. Either eventuality could take a LONG time (think in terms of hours of elapsed time) during which random parts of your application will probably have either terminated or hung.

Bottom line: mirror rootvg. Use the mirrorvg command and then follow the extra steps described in the mirrorvg man page for your version of AIX.

HACMP CONFIGURATION

CONFIGURING NETWORK INTERFACE ADAPTERS

In our example, we have two NICs, one used as the cluster interconnect and the other as the bootable adapter. The service IP address will be activated on the bootable adapter after cluster services are started on the nodes. The following IP addresses are used in the setup:

NODE1: hostname btcppesrv5
Boot IP address - 10.73.70.155 btcppesrv5
Netmask - 255.255.254.0
Interconnect IP address - 192.168.73.100 btcppesrv5i
Netmask - 255.255.255.0
Service IP address - 10.73.68.222 btcppesrv5sv
Netmask - 255.255.254.0

NODE2: hostname btcppesrv6
Boot IP address - 10.73.70.156 btcppesrv6
Netmask - 255.255.254.0
Interconnect IP address - 192.168.73.101 btcppesrv6i
Netmask - 255.255.255.0
Service IP address - 10.73.68.223 btcppesrv6sv
Netmask - 255.255.254.0

    EDITING CONFIGURATION FILES FOR HACMP

1. /usr/sbin/cluster/netmon.cf - All the IP addresses present in the network need to be entered in this file. Refer to Appendix for sample file.
2. /usr/sbin/cluster/etc/clhosts - All the IP addresses present in the network need to be entered in this file. Refer to Appendix for sample file.
3. /usr/sbin/cluster/etc/rhosts - All the IP addresses present in the network need to be entered in this file. Refer to Appendix for sample file.
4. /.rhosts - All the IP addresses present in the network, with the username (i.e. root), need to be entered in this file. Refer to Appendix for sample file.
5. /etc/hosts - All the IP addresses with their IP labels present in the network need to be entered in this file. Refer to Appendix for sample file.

Note: All the above mentioned files need to be configured on both nodes of the cluster.
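The Appendix with the sample files is not reproduced here, but based on the addresses listed above, the /etc/hosts entries for this cluster would look something like the following sketch (identical on both nodes):

```
10.73.70.155    btcppesrv5      # node 1 boot
192.168.73.100  btcppesrv5i     # node 1 interconnect
10.73.68.222    btcppesrv5sv    # node 1 service
10.73.70.156    btcppesrv6      # node 2 boot
192.168.73.101  btcppesrv6i     # node 2 interconnect
10.73.68.223    btcppesrv6sv    # node 2 service
```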

    CREATING CLUSTER USING SMIT

This is a sample HACMP configuration that might require customization for your environment. This section demonstrates how to configure two AIX nodes, btcppesrv5 and btcppesrv6, into an HACMP cluster.

    1. Configure two AIX nodes to allow the user root to use the rcp and remsh commands between


themselves without having to specify a password.
2. Log in as user root on AIX node btcppesrv5.
3. Enter the following command to create an HACMP cluster.
# smit hacmp
Perform the following steps. These instructions assume that you are using the graphical user interface to SMIT (that is, smit M). If you are using the ASCII interface to SMIT (that is, smit C), modify these instructions accordingly.
a) Click Initialization and Standard Configuration.
b) Click Configure an HACMP Cluster and Nodes.
c) In the Cluster Name field, enter netapp.
d) In the New Nodes (via selected communication paths) field, enter btcppesrv5 and btcppesrv6.
e) Click OK.
f) Click Done.
g) Click Cancel.
h) Select the Exit > Exit SMIT Menu option.

4. Enter the following command to configure the heartbeat networks as private networks.
# smit hacmp
Perform the following steps.
a) Click Extended Configuration.
b) Click Extended Topology Configuration.
c) Click Configure HACMP Networks.
d) Click Change/Show a Network in the HACMP cluster.
e) Select net_ether_01 (192.168.73.0/24).
f) In the Network Attribute field, select private.
g) Click OK.
h) Click Done.
i) Click Cancel.
j) Select the Exit > Exit SMIT Menu option.

5. Enter the following command to configure Service IP Labels/Addresses.
# smit hacmp
Perform the following steps.
a) Click Initialization and Standard Configuration.
b) Click Configure Resources to Make Highly Available.
c) Click Configure Service IP Labels / Addresses.
d) Click Add a Service IP Label / Address.
e) In the IP Label / Address field, enter btcppesrv5sv.
f) In the Network Name field, select net_ether_02 (10.73.70.0/23). The Service IP label will be activated on network interface 10.73.70.0/23 after cluster service starts.
g) Click OK.
h) Click Done.
i) Click Cancel.
j) Similarly, follow steps d) to h) to add the second service IP label, btcppesrv6sv.
k) Select the Exit > Exit SMIT Menu option.

6. Enter the following command to create empty Resource Groups with node priorities.
# smit hacmp
Perform the following steps.
a) Click Initialization and Standard Configuration.
b) Click Configure HACMP Resource Groups.


c) Click Add a Resource Group.
d) In the Resource Group Name field, enter RG1.
e) In the Participating Nodes (Default Node Priority) field, enter btcppesrv5 and btcppesrv6. The Resource Group RG1 will be online on btcppesrv5 first when cluster service starts; in the event of failure, RG1 will be taken over by btcppesrv6, as the node priority for RG1 is assigned to btcppesrv5 first.
f) Click OK.
g) Click Done.
h) Click Cancel.
i) Similarly, follow steps d) to h) to add the second Resource Group, RG2, with node priority first assigned to btcppesrv6.
j) Select the Exit > Exit SMIT Menu option.

7. Enter the following command to make Service IP labels part of Resource Groups.
# smit hacmp
Perform the following steps.
a) Click Initialization and Standard Configuration.
b) Click Configure HACMP Resource Groups.
c) Click Change/Show Resources for a Resource Group (standard).
d) Select a Resource Group from the pick list, e.g. RG1.
e) In the Service IP Labels / Addresses field, enter btcppesrv5sv, as the btcppesrv5sv service IP label has to be activated on the first node, btcppesrv5.
f) Click OK.
g) Click Done.
h) Click Cancel.
i) Similarly, follow steps c) to h) to add the second Service IP label, btcppesrv6sv, in Resource Group RG2. The btcppesrv6sv service IP label has to be activated on the second node, btcppesrv6.
j) Select the Exit > Exit SMIT Menu option.

    VERIFYING AND SYNCHRONIZING CLUSTER USING SMIT

This section demonstrates how to verify and synchronize the nodes in an HACMP cluster. This process actually verifies the HACMP configuration done from one node and then synchronizes it to the other node in the cluster. So whenever any changes are to be made in the HACMP cluster, they must be made from a single node and then synchronized to the other nodes.
1. Log in as user root on AIX node btcppesrv5.
2. Enter the following command to verify and synchronize all nodes in the HACMP cluster.
# smit hacmp
Perform the following steps.
a) Click Initialization and Standard Configuration.
b) Click Verify and Synchronize HACMP Configuration.
c) Click Done.
d) Select the Exit > Exit SMIT Menu option.

    STARTING CLUSTER SERVICES

This section demonstrates how to start an HACMP cluster on both the participating nodes.
1. Log in as user root on AIX node btcppesrv5.
2. Enter the following command to start the HACMP cluster.


# smit cl_admin
Perform the following steps.
a) Click Manage HACMP services.
b) Click Start Cluster Services.
c) In the Start Now, on System Restart or Both field, select now.
d) In the Start Cluster Services on these nodes field, enter btcppesrv5 and btcppesrv6. The cluster services can be started on both the nodes simultaneously.
e) In the Startup Cluster Information Daemon field, select true.
f) Click OK.
g) Click Done.
h) Click Cancel.
i) Select the Exit > Exit SMIT Menu option.

    STOPPING CLUSTER SERVICES

    This section demonstrates how to stop an HACMP cluster on both the participating nodes.

1. Log in as user root on AIX node btcppesrv5.
2. Enter the following command to stop the HACMP cluster.
# smit cl_admin
Perform the following steps.
a) Click Manage HACMP services.
b) Click Stop Cluster Services.
c) In the Stop Now, on System Restart or Both field, select now.
d) In the Stop Cluster Services on these nodes field, enter btcppesrv5 and btcppesrv6. The cluster services can be stopped on both the nodes simultaneously.
e) Click OK.
f) In the Are You Sure? dialog box, click OK.
g) Click Done.
h) Click Cancel.
i) Select the Exit > Exit SMIT Menu option.

    CONFIGURING DISK HEARTBEAT

For configuring disk heartbeating, it is required to create the Enhanced Concurrent Capable volume group on both the AIX nodes. To be able to use HACMP C-SPOC successfully, some basic IP-based topology must already exist, and the storage devices must have their PVIDs in both systems' ODMs. This can be verified by running the lspv command on each AIX node. If a PVID does not exist on an AIX node, it is necessary to run the chdev command:

btcppesrv5#> chdev -l hdisk3 -a pv=yes
btcppesrv6#> chdev -l hdisk3 -a pv=yes

This will allow C-SPOC to match up the device(s) as known shared storage devices.

This demonstrates how to create the Enhanced Concurrent Volume Group:

    1. Log in as user root on AIX node btcppesrv5.

2. Enter the following command to create the enhanced concurrent VG.
# smit vg
Perform the following steps.


a) Click Add Volume Group.
b) Click Add an Original Group.
c) In the Volume group name field, enter heartbeat.
d) In the Physical Volume Names field, enter hdisk3.
e) In the Volume Group Major number field, enter 59. This number must be available on the particular AIX node; it can be found from the available list in the field.
f) In the Create VG concurrent capable field, enter YES.
g) Click OK.
h) Click Done.
i) Click Cancel.
j) Select the Exit > Exit SMIT Menu option.

On the btcppesrv5 AIX node, check the newly created volume group using the lsvg command. On the second AIX node, import the volume group:

    btcppesrv6#> importvg -V 59 -y heartbeat hdisk3

Since the enhanced concurrent volume groups are available on both the AIX nodes, we will use the discovery method of HACMP to find the disks available for heartbeat.

    This demonstrates how to configure Disk heartbeat in HACMP:

    1. Log in as user root on AIX node btcppesrv5.

2. Enter the following command to configure disk heartbeat.
# smit hacmp
Perform the following steps.
a) Click Extended Configuration.
b) Click Discover HACMP-related information from configured Nodes. This will run automatically and create the /usr/es/sbin/cluster/etc/config/clvg_config file that contains the information it has discovered.
c) Click Done.
d) Click Cancel.
e) Click Extended Configuration.
f) Click Extended Topology Configuration.
g) Click Configure HACMP communication Interfaces/Devices.
h) Click Add Communication Interfaces/Devices.
i) Click Add Discovered Communication Interfaces and Devices.
j) Click Communication Devices.
k) Select both the devices shown in the list.
l) Click Done.
m) Click Cancel.
n) Select the Exit > Exit SMIT Menu option.

    It is necessary to add the volume group to an HACMP resource group and synchronize the cluster.

    Enter the following command to create empty resource groups with different policies than those we created earlier:

    # smit hacmp

    Perform the following steps:

    a) Click Initialization and Standard Configuration.

    b) Click Configure HACMP Resource Groups.


    c) Click Add a Resource Group.
    d) In the Resource Group Name field, enter RG3.
    e) In the Participating Nodes (Default Node Priority) field, enter btcppesrv5 and btcppesrv6.
    f) In the Startup Policy field, enter Online On All Available Nodes.
    g) In the Fallover Policy field, enter Bring Offline (On Error Node Only).
    h) In the Fallback Policy field, enter Never Fallback.
    i) Click OK.
    j) Click Done.
    k) Click Cancel.
    l) Click Change/Show Resources for a Resource Group (Standard).
    m) Select RG3 from the list.
    n) In the Volume Groups field, enter heartbeat (the concurrent-capable volume group created earlier).
    o) Click OK.
    p) Click Done.
    q) Click Cancel.
    r) Select the Exit > Exit SMIT Menu option.

    Normally the commands starting with cl* are located in /usr/es/sbin/cluster/utilities/.

    clstat - show cluster state and substate; needs clinfo.
    cldump - SNMP-based tool to show cluster state.
    cldisp - similar to cldump; a perl script to show cluster state.
    cltopinfo - list the local view of the cluster topology.
    clshowsrv -a - list the local view of the cluster subsystems.
    clfindres (-s) - locate the resource groups and display status.
    clRGinfo -v - locate the resource groups and display status.
    clcycle - rotate some of the log files.
    cl_ping - a cluster ping program with more arguments.
    clrsh - cluster rsh program that takes cluster node names as arguments.
    clgetactivenodes - which nodes are active?
    get_local_nodename - what is the name of the local node?
    clconfig - check the HACMP ODM.
    clRGmove - online/offline or move resource groups.
    cldare - sync/fix the cluster.
    cllsgrp - list the resource groups.
    clsnapshotinfo - create a large snapshot of the HACMP configuration.
    cllscf - list the network configuration of an HACMP cluster.
    clshowres - show the resource group configuration.
    cllsif - show network interface information.
    cllsres - show short resource group information.
    cllsnode - list a node-centric overview of the HACMP configuration.
    lssrc -ls clstrmgrES - list the cluster manager state.
    lssrc -ls topsvcs - show heartbeat information.

    Q. I have a rented dedicated LAMP server and I need to know what version of Apache I am running. How do I find out my Apache server version? How do I find out what modules are loaded?

    A. httpd is the Apache HyperText Transfer Protocol (HTTP) server program. To find out the Apache version, log in to the server using ssh, then run the following command to print the version of httpd and exit:

    # httpd -v

    To dump a list of loaded static and shared modules:

    # httpd -M

    Print the version and build parameters of httpd, and then exit:

    # httpd -V

    If you made any changes to httpd.conf, check the httpd syntax for errors using the -t option:

    # httpd -t

    Q. How do I hide the Apache version number under CentOS Linux 5 server?

    A. You can easily hide the Apache (httpd) version number and other information. There are two config directives that control the Apache version information. The ServerSignature directive adds a line containing the Apache HTTP Server version and the ServerName to any server-generated documents, such as error messages sent back to clients. ServerSignature is set to On by default. The ServerTokens directive controls whether the Server response header field sent back to clients includes a description of the generic OS type of the server as well as information about compiled-in modules. Setting this to Prod displays only "Apache" as the server name, with no version number.

    Open your httpd.conf file using a text editor such as vi and append/modify the config directives as follows:

    ServerSignature Off
    ServerTokens Prod

    Save and close the file, then restart the Apache web server:

    # service httpd restart

    Q. How do I find out syntax errors in my Apache web server configuration file?

    # /usr/sbin/httpd -t

    Q. You are root but you are still unable to delete a file. Why?

    A. The immutable attribute is set!

    rm -f /etc/inittab
    rm: cannot remove /etc/inittab: Operation not permitted

    lsattr /etc/inittab
    ----i-------- /etc/inittab

    chattr -i /etc/inittab

    Q. What files should you usually change attributes for?

    A. When configuring systems that will be directly exposed to the Internet or other hostile environments and that must host shell accounts or services such as HTTP and FTP, I usually add the append-only and immutable flags once I have installed and configured all necessary software and user accounts:

    chattr -R +i /bin /boot /etc /lib /sbin
    chattr -R +i /usr/bin /usr/include /usr/lib /usr/sbin
    chattr +a /var/log/messages /var/log/secure
    chattr -R +i /usr


    Q. What are the CLI user tools?

    A.
    gpasswd - method of administering the /etc/group file.
    pwck, grpck - tools used for the verification of the password, group, and associated shadow files.
    pwconv, pwunconv - tools used for the conversion of passwords to shadow passwords and back to standard passwords.

    Q. How do you add a completely unprivileged user for ftp?

    A. useradd -d /var/www/html/ -n -M -s /dev/null ftpsecure

    Q. What is the difference between su and su - ?

    A. su only changes to root permissions; su - (with the dash) also changes the environment (variables) to root's.
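    The environment difference can be simulated without root: a plain subshell inherits the caller's variables (like su), while a scrubbed environment (roughly what the fresh login environment of su - gives you) does not. A minimal sketch using bash in place of su, since su itself requires root; DEMO_VAR is a hypothetical variable:

    ```shell
    # Simulating "su" vs "su -" with bash instead of su (su needs root).
    export DEMO_VAR=caller-value

    # Plain subshell, like "su": the caller's environment is inherited.
    kept=$(bash -c 'echo "${DEMO_VAR:-unset}"')

    # Scrubbed environment, roughly like the fresh login environment of "su -".
    reset=$(env -i bash -c 'echo "${DEMO_VAR:-unset}"')

    echo "plain: $kept, login-like: $reset"
    ```

    This prints "plain: caller-value, login-like: unset", mirroring why root's PATH and HOME are only picked up with su -.
    
    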

    Q. How do you change the selinux context of a file to its default context?

    A. restorecon

    Q. How do you check to see if SELinux is in enforcing mode?

    A. getenforce

    Q. What is the difference between atime, ctime and mtime?

    A. Writing to a file changes its mtime and ctime, while reading a file changes its atime

    atime is when the file was last read: ls -l --time=atime (or ls -lu)
    ctime is the inode change time: ls -l --time=ctime (or ls -lc)
    mtime is the file modification time: ls -l

    mtime changes when you write to the file; it is the age of the data in the file. Whenever mtime changes, so does ctime. But ctime also changes on metadata changes: for example, it will change if you change the owner or the permissions on the file, or rename the file.

    Unlike atime and mtime, ctime cannot be set with utime() as used by touch; the only way to set it toan arbitrary value is by changing the system clock.
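    These rules can be checked directly with GNU stat, which prints mtime and ctime as epoch seconds (%Y and %Z). A minimal sketch, assuming GNU coreutils on a filesystem with at least one-second timestamp resolution:

    ```shell
    # Demonstrate which timestamps each operation touches (assumes GNU stat).
    f=$(mktemp)
    echo "hello" > "$f"
    mtime1=$(stat -c %Y "$f"); ctime1=$(stat -c %Z "$f")

    sleep 1
    echo "more" >> "$f"                 # writing updates both mtime and ctime
    mtime2=$(stat -c %Y "$f"); ctime2=$(stat -c %Z "$f")

    sleep 1
    chmod 600 "$f"                      # metadata change updates ctime only
    mtime3=$(stat -c %Y "$f"); ctime3=$(stat -c %Z "$f")

    echo "write bumped mtime: $(( mtime2 > mtime1 ))"
    echo "chmod bumped ctime but not mtime: $(( ctime3 > ctime2 && mtime3 == mtime2 ))"
    rm -f "$f"
    ```

    Both echo lines print 1: the append advanced mtime (and ctime), while chmod advanced only ctime.
    
    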

    Q. Can you have /boot partition on an LVM?

    A. The /boot/ partition cannot reside on an LVM volume because the GRUB boot loader cannot read it.

    Q. What are ZONE files?

    A. Zone files contain information about a particular namespace. Zone files are stored in the /var/named working directory. Each zone file is named according to the file option data in the zone statement, usually in a way that relates to the domain in question and identifies the file as containing zone data, such as example.com.zone.

    Each zone file may contain directives and resource records. Directives tell the nameserver to do a certain thing or apply a special setting to the zone. Resource records define the parameters of the zone, assigning an identity within the zone's namespace to particular systems. Directives are optional, but resource records are required to provide nameservice to that zone. All directives and resource records should go on their own lines.


    $ vi /var/named/zones/llc.com.db

    llc.com. IN SOA dns1.llc.com. root.dns1.llc.com. (
        001 ; serial
        1H  ; refresh
        15M ; retry
        1W  ; expiry
        1H  ; ttl
    )

    @ IN NS dns1

    dns1 IN A 192.168.2.5
    @ IN A 192.168.2.5

    www IN CNAME dns1

    redhat.llc.com. IN NS dns1.redhat.llc.com.
    dns1.redhat.llc.com. IN A 192.168.2.10

    $ vi /var/named/zones/2.168.192.db

    @ IN SOA dns1.llc.com. root.dns1.llc.com. (
        001 ; serial
        1H  ; refresh
        15M ; retry
        1W  ; expiry
        1H  ; ttl
    )

    @ IN NS dns1.llc.com.
    5 IN PTR dns1.llc.com.

    Q. What is an FQDN and a secondary nameserver?
    A. The FQDN of a host can be broken down into sections organized in a tree hierarchy. Except for the hostname, every section divided by "." is called a zone. Zones are defined on authoritative nameservers in zone files. Zone files are stored on primary nameservers (also called master nameservers), which are truly authoritative and where changes are made to the files.

    Secondary nameservers (also called slave nameservers) receive their zone files from the primary nameservers. Any nameserver can be a primary and secondary nameserver for different zones at the same time, and it may also be considered authoritative for multiple zones. It all depends on the nameserver's particular configuration.

    Every second-level domain should have one primary and one secondary nameserver running on different physical machines for redundancy.

    There are four nameserver configuration types:

    master - Stores original and authoritative zone records for a certain zone, answering questions from other nameservers searching for answers concerning that namespace.

    slave - Also answers queries from other nameservers concerning namespaces for which it is considered an authority. However, slave nameservers get their namespace information from master nameservers via a zone transfer: the master sends the slave a NOTIFY message when a zone changes, and the slave requests the updated zone data from the master, provided it is authorized to receive the transfer.

    caching-only - Offers name-to-IP resolution services but is not authoritative for any zones.


    Answers for all resolutions are usually cached in an in-memory database for a fixed period of time, usually specified by the retrieved zone record, to speed up resolution for other DNS clients after the first resolution.

    forwarding - Forwards requests to a specific list of nameservers to be resolved. If none of the specified nameservers can perform the resolution, the process stops and the resolution fails.
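    The four types map onto named.conf roughly as follows. This is a sketch with hypothetical zone names, file names, and addresses; each fragment would live in a different server's configuration:

    ```
    // master: holds the original zone file
    zone "example.com" {
        type master;
        file "example.com.zone";
    };

    // slave: pulls the zone from the master via zone transfer
    zone "example.com" {
        type slave;
        masters { 192.168.2.5; };
        file "slaves/example.com.zone";
    };

    // caching-only: recursion enabled, no authoritative zones of its own
    options {
        recursion yes;
    };

    // forwarding: hand every query to the listed servers, or fail
    options {
        forwarders { 192.168.2.5; 192.168.2.6; };
        forward only;
    };
    ```
    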

    Q. How does DNS resolution work?
    A. A client application requests an IP address from the nameserver, usually by connecting to UDP port 53. The nameserver attempts to resolve the FQDN using its resolver library, which may contain authoritative information about the requested host or cached data about that name from an earlier query. If the nameserver does not already have the answer in its resolver library, it turns to the root nameservers to determine which nameservers are authoritative for the FQDN in question. Then, with that information, it queries the authoritative nameservers for that name to determine the IP address.

    Q. How do you list all the loaded Apache modules?
    A. You can use the following command to list all the loaded modules in Apache (both DSO and static):

    apachectl -t -D DUMP_MODULES

    In Apache 2.2.x, you can also use httpd -M to list all the loaded modules (both static and DSO).

    Q. What is an inode?
    A. Every file has its description stored in a structure called an inode. The inode contains info about the file size, access and modification times, permissions, and so on. In addition to descriptions of the file, the inode contains pointers to the data blocks of the file.
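    Because directory entries are just names pointing at an inode, two hard links to a file are the same file. A quick sketch using stat from GNU coreutils in a scratch directory:

    ```shell
    # Hard links share a single inode (assumes GNU stat).
    d=$(mktemp -d)
    echo "data" > "$d/a"
    ln "$d/a" "$d/b"                 # second name for the same inode

    ia=$(stat -c %i "$d/a")          # inode number of a
    ib=$(stat -c %i "$d/b")          # inode number of b
    links=$(stat -c %h "$d/a")       # hard link count, now 2

    echo "same inode: $(( ia == ib )), link count: $links"
    rm -rf "$d"
    ```

    This prints "same inode: 1, link count: 2": both names resolve to one inode, and the inode's link count tracks how many names it has.
    
    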

    Q. Describe the Linux boot-up sequence.
    A. The BIOS reads the MBR, where the boot loader sits; the boot loader reads the kernel into memory; the kernel starts the init process; init reads inittab and executes rc.sysinit; the rc script then starts services to reach the default runlevel; and once this is done, the last thing that gets run is the rc.local script.
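    On a SysV system such as CentOS 5, the /etc/inittab that init reads looks roughly like this (a hypothetical excerpt, not a complete file):

    ```
    # Default runlevel:
    id:3:initdefault:
    # Run once at boot, before anything else:
    si::sysinit:/etc/rc.d/rc.sysinit
    # Start the runlevel 3 services (the rc script runs the S* scripts):
    l3:3:wait:/etc/rc.d/rc 3
    ```
    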

    Q. How do you create a Samba server?

    A. Step #1: Add a user joe to the UNIX/Linux system.
    The adduser command adds a user to the system according to command-line options and configuration information in /etc/adduser.conf. It is a friendlier front end to low-level tools like useradd.

    Step #2: Add the user to Samba.
    Now user joe has an account on the Linux/UNIX box. Use the smbpasswd command to specify that the username following should be added to the local smbpasswd file:

    # smbpasswd -a joe

    Step #3: Add the user to a Samba share.

    By default the user gets access to /home/joe from a Windows system. Let us say you want to give joe access to the /data/accounts directory (make sure /data/accounts exists). Open the /etc/samba/smb.conf file and add/modify a share called [accounts]:

    [accounts]
    comment = Accounts data directory


    path = /data/accounts
    valid users = vivek raj joe
    public = no
    writable = yes

    Save the file.

    Step #4: Restart Samba:

    # service smb restart

    Q. All my local Linux user accounts will be able to log in to my Samba server and access the share. How do I restrict access to particular users or a network subnet such as 192.168.2.1/24?
    A. You can use TCP wrappers to limit subnet access via:

    1. /etc/hosts.allow - This file describes the names of the hosts which are allowed to use the local INET services, as decided by the /usr/sbin/tcpd server.

    2. /etc/hosts.deny - This file describes the names of the hosts which are NOT allowed to use the local INET services, as decided by the /usr/sbin/tcpd server.

    For example, allow access to the smbd service inside the LAN only via /etc/hosts.allow:

    smbd : 192.168.2.

    hosts allow: Samba configuration
    Open your smb.conf file and add the following line to [share]:

    [share]
    hosts allow = 192.168.2. 127.0.0.1

    valid users: Samba configuration
    Open your smb.conf file and add the following line to [share]:

    [share]
    valid users = user1 user2 @group1 @group2

    read only & write list: Samba configuration
    You can also grant read and write access to sets of users with the read list and write list directives.

    [share]
    read only = yes
    write list = user1 user2 @group1 @group2

    Examples
    Make the [sales] share read only, but allow users tom and jerry to write to it:

    [sales]
    comment = All Printers
    path = /nas/fs/sales
    read only = yes
    write list = tom jerry

    Q. How can I configure Samba to use domain accounts for authentication, so that users are authenticated against a domain controller?
    A. The Samba server provides options that allow authentication against a domain controller. Edit your smb.conf file using the vi text editor. Type the following command as the root user:

    # vi /etc/samba/smb.conf

    Make sure the parameters are set as follows in the [global] section of the smb.conf file:

    workgroup = YOUR-DOMAIN-CONTROLLER
    netbios name = YOUR-SAMBA-SERVER-NAME
    password server = IP-ADDRESS-OF-YOUR-DOMAIN-CONTROLLER
    encrypt passwords = Yes
    preferred master = No
    domain master = No


    Where,

    workgroup: This controls what workgroup your server will appear to be in when queried by clients.

    netbios name : This sets the NetBIOS name by which a Samba server is known.

    encrypt passwords: This boolean (YES or NO value) controls whether encrypted passwords will be used with the client. Note that Windows NT 4.0 SP3 and above, and also Windows 98, will by default expect encrypted passwords unless a registry entry is changed. This is what you need to use for Windows XP/2000/2003 systems.

    Restart the Samba server:

    # /etc/init.d/samba restart