
Installation of Cluster using Rocks

Vadivelan, Vighnesh

Date: 7/04/2010
Department of Aerospace Engineering
Indian Institute of Technology Bombay

Hardware configuration

                      Hyper              Hyperx Headnode     Hyperx NAS
Model                 HP DL 140 G3       HP DL 160 G5        HP DL 160 G5
Processor             Intel Xeon 5150    Intel Xeon E5430    Intel Xeon E5430
                      @ 2.66 GHz         @ 2.66 GHz          @ 2.66 GHz
Processors per node   2                  2                   2
Cores per processor   2                  4                   4
Cache size            4096 KB            6144 KB             6144 KB
RAM                   4040864 kB         8182224 kB          8182224 kB
Hard disk             60 GB              160 GB              2 TB

Pre-installation knowledge

Supported hardware – Hyper does not support Rocks-5.2 and Hyperx does not support Rocks-4.3

Eth0 and eth1 – Check which port is public and which is private from the hardware catalogue. For Rocks, eth0 → private and eth1 → public (a quick link-state check is sketched below)

Cable connections – between the nodes and the Ethernet switch, and between Ethernet switches

Compatibility with the OS that is going to be installed – check that the OS is compatible with software such as the PGI compilers

Availability of the required roll CDs or DVDs for the Rocks installation

Source: http://www.rocksclusters.org/wordpress
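A quick way to confirm which physical port ends up as eth0 and which as eth1 (a hedged sketch; it assumes the usual two-NIC layout and that ethtool is available) is to connect one cable at a time and watch the link state:

#ethtool eth0 | grep "Link detected"
#ethtool eth1 | grep "Link detected"

The interface that reports "Link detected: yes" when only the private switch cable is plugged in must become eth0; the other one goes to the public network as eth1. The port LEDs or #ip link show give the same information if ethtool is not installed.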

Rolls

Base – contains the basic utilities for starting the cluster installation

Bio – a collection of some of the most common bio-informatics tools

Ganglia - installs and configures the Ganglia cluster monitoring system

HPC – OpenMPI, MPICH and MPICH2 packages are included in this roll

Kernel – includes all the kernel utilities

OS – contains the operating system utilities (CentOS 5.1 for Rocks 5.2)

Area 51 – contains utilities and services used to analyze the integrity of the files and the kernel on your cluster

Torque – contains the Torque/Maui job scheduling and queuing tools

Web-server – required for making the cluster monitoring pages available on the public Internet

Service Pack – contains the bug fixes for the Rocks version

PGI – contains the PGI compiler packages

Intel-Developer – contains the Intel compiler packages

Hyperx Cluster Layout

Building of rocks cluster :The beginning

Download the ISO images of the rolls from the Rocks web site

Burn the ISO image of the Jumbo Roll package onto a DVD

Burn the ISO images of any additional required rolls onto CDs

Mount the Jumbo Roll DVD in the DVD drive of the node which will act as the frontend node of the cluster, or connect an external USB DVD drive to the frontend if there is no onboard DVD drive

Building of rocks cluster :Frontend installation

Boot the frontend node. The boot order in the BIOS should be set to CD/DVD drive first

After the frontend boots up, the following screen will be displayed

If you have used onboard CD/DVD drive type 'build'

If you have used external CD/DVD drive type 'build driverload=usb-storage'

Note: The above screen remains for only a few seconds. If you miss typing this command, the node will be installed as a compute node, and you will have to reboot the frontend and reinstall the node as a frontend

Building of rocks cluster :Frontend installation

Soon the following screen will be displayed

From this screen, you'll select your rolls.

In this procedure, we'll only be using CD media, so we'll only be clicking on the 'CD/DVD-based Roll' button.

Click the 'CD/DVD-based Roll' button

Building of rocks cluster :Frontend installation

The CD tray will eject and you will see this screen

Put your first roll in the CD tray (for the first roll, since the Jumbo Roll DVD is already in the tray, simply push the tray back in).

Click the 'Continue' button.

Building of rocks cluster :Frontend installation

The rolls will be discovered and the following screen will be displayed

Select the desired rolls listed earlier and click submit

Building of rocks cluster :Frontend installation

Network settings for private (eth0)

Hyperx : 192.168.2.1/255.255.0.0

Building of rocks cluster :Frontend installation

Network settings for public (eth1)

Hyperx : 10.101.2.3/255.255.255.0

Building of rocks cluster :Frontend installation

Network Settings

Hyperx : 10.101.250.1/10.200.1.11

Building of rocks cluster :Frontend installation

Head Node Partition

By default, auto-partitioning will be used

Building of rocks cluster :Frontend installation

Manual Partition done for hyperx

/ (root) : 20 GB

swap : 8GB

/var : 4GB

/export : Remaining

Building of rocks cluster :Frontend installation

Installation Process Starts

Building of rocks cluster :Frontend disk partition

$df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/sda1               19G  6.4G   12G  36% /
/dev/sda4              107G  5.8G   96G   6% /export
/dev/sda3              3.8G  429M  3.2G  12% /var
tmpfs                  4.0G     0  4.0G   0% /dev/shm
tmpfs                  2.0G   16M  1.9G   1% /var/lib/ganglia/rrds
nas-0-0:/export/data1  1.3T   78G  1.2T   7% /export/home

Building of rocks cluster :Client installation

Log in as root on the frontend and type 'insert-ethers'

PXE-boot the node which will act as the first compute node of the cluster

For compute node installation, select 'Compute'

Building of rocks cluster :Client installation

After the frontend accepts the DHCP request from the compute node, communication between the frontend and the compute node is established and the following screen will be displayed

During this process, the frontend detects the MAC address of the compute node that is about to be installed

Building of rocks cluster :Client installation

After discovering the MAC address, the frontend allots a hostname and an IP address to that compute node

The frontend names the first compute node detected compute-0-0 and continues with compute-0-1, compute-0-2 and so on

The image shows that kickstart has started on the compute node (which means the installation process has started)
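To confirm that the frontend has recorded the new node, the Rocks command line can be queried (a quick hedged check; compute-0-0 below is just the first autogenerated name):

#rocks list host
#rocks list host interface compute-0-0

The first command lists all appliances known to the frontend; the second shows the MAC address and private IP that were just assigned to the new node.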

Building of rocks cluster :Client installation

The installation process starts and the following screen will be displayed on the compute node

For each compute node, follow the installation steps presented in slides 20 to 23

After all the compute nodes in rack 0 are installed, to install the compute nodes in rack 1 type 'insert-ethers -rack=1 -rank=0'. This starts naming the compute nodes in rack 1 as compute-1-0 and so on

Building of rocks cluster :NAS installation

Log in as root on the frontend and type 'insert-ethers'

PXE-boot the node which will act as the NAS node of the cluster

The frontend names the first NAS node detected nas-0-0 and so on, in the same way as for compute nodes

For I/O node installation, select 'NAS Appliance'

Building of rocks cluster :NAS Configuration

All NAS installation steps are similar to those for a compute node, except for the manual partitioning of the NAS disk space

For the hyperx NAS node, manual partitioning: / (root) : 20 GB

swap : 8GB

/var : 4GB

/export : Remaining

On the NAS, the default home directory location is /export/data1

Building of rocks cluster :NAS partition

$df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p4                 19G  3.4G   15G  19% /
/dev/mapper/VolGroup00-LogVol00  1.3T   78G  1.2T   7% /export/data1
/dev/cciss/c0d0p2                1.9G   90M  1.8G   5% /boot
tmpfs                            4.0G     0  4.0G   0% /dev/shm

Building of rocks cluster :NAS NFS configuration

After the NAS node boots up, carry out the following steps to set up the NAS as the NFS server of the cluster:

Edit the file /etc/exports on the NAS node and add the following line:

/export/data1 192.168.2.0/255.255.255.0(rw,no_root_squash,sync)

Edit the file /etc/fstab on the head node and add the following line:

nas-0-0:/export/data1 /export/home nfs defaults 0 0

Run the command

#mount -a

These steps mount the /export/data1 directory of the NAS node onto the /export/home directory, making it available to every other node of the cluster with a private IP address in the range 192.168.2.0 to 192.168.2.255. A quick way to verify the export is sketched below.
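As a sanity check (a hedged sketch, assuming the standard NFS userland tools shipped with CentOS), the export can be re-read and inspected before mounting.

On the NAS node, re-read /etc/exports and list what is currently exported:
#exportfs -ra
#exportfs -v

On the head node, confirm the export is visible, then mount and check it:
#showmount -e nas-0-0
#mount -a
#df -h /export/home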

Building of rocks cluster :NAS imp. directories

$ls -l /export/
total 4
drwxr-xr-x 25 root root 4096 Apr 17 09:47 data1
$ls -l /export/data1/
total 128
drwx------ 18 amjad       amjad        4096 Mar 30 09:52 amjad
-rw-------  1 root        root         9216 Apr 17 09:47 aquota.group
-rw-------  1 root        root         9216 Apr 17 09:52 aquota.user
drwx------  7 asgerali    asgerali     4096 Apr  2 11:56 asgerali
drwx------  6 avinash     avinash      4096 Apr 17 10:00 avinash
drwx------  6 ayan        ayan         4096 Jan 23 11:05 ayan
drwx------ 12 bharat      bharat       4096 Apr 16 13:38 bharat
drwx------ 10 halbe       halbe        4096 Apr 11 21:57 halbe
drwx------ 12 krish       krish        4096 Mar 28 12:23 krish
drwx------  2 root        root        16384 Jan  5 22:05 lost+found
drwx------ 25 nileshjrane nileshjrane  4096 Apr 18 22:11 nileshjrane
drwx------ 11 nitin       nitin        4096 Mar 29 00:58 nitin
drwx------ 23 pankaj      pankaj       4096 Apr 16 13:27 pankaj
drwx------  4 prasham     prasham      4096 Jan 28 21:08 prasham

Building of rocks cluster :frontend imp. directories

$ls -l /export/
total 4
drwxr-xr-x  9 root    root  4096 Apr 16 11:53 apps
drwxr-xr-x  4 biouser root  4096 Jan  6 18:19 bio
drwxr-xr-x 25 root    root  4096 Apr 17 09:47 home
drwx------  2 root    root 16384 Jan  6 17:40 lost+found
drwxr-xr-x  3 root    root  4096 Jun 18  2009 rocks
drwxr-xr-x  3 root    root  4096 Jan  6 18:08 site-roll
$ls -l /export/home/
total 128
drwx------ 18 amjad    amjad    4096 Mar 30 09:52 amjad
-rw-------  1 root     root     9216 Apr 17 09:47 aquota.group
-rw-------  1 root     root     9216 Apr 17 09:52 aquota.user
drwx------  7 asgerali asgerali 4096 Apr  2 11:56 asgerali
$ls -l /export/apps/
total 20
drwxr-xr-x 3 root root 4096 Jan 21 00:25 modules
drwxr-xr-x 3 root root 4096 Apr 16 11:54 mpich2
drwxr-xr-x 4 root root 4096 Jan 13 00:47 old.pgi-9.0.4
drwxr-xr-x 3 root root 4096 Feb 17 09:50 openmpi
drwxr-xr-x 4 root root 4096 Mar 11 15:47 pgi-7.2.4

Adding user and setting user home directories

To create a user account, carry out the following steps as root on the frontend:

First run the commands:
#rocks set attr Info_HomeDirSrv nas-0-0.local
#rocks set attr Info_HomeDirLoc /export/data1

Add the user with the following command:
#adduser <username>

After the user has been added, synchronise the user information across all nodes of the cluster (a quick propagation check is sketched at the end of this slide):
#rocks sync users

To remove a user account, run the commands:
#umount /export/home/<username>   (as root on the frontend)
#userdel <username>   (as root on the NAS)
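A hedged way to check that the new account has propagated to the compute nodes (the username is a placeholder):

#rocks run host compute "id <username>"
#ls -ld /export/home/<username>

Every compute node should report the same UID/GID, and the home directory should appear under /export/home once the NFS mount from the NAS is in place.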

Adding roll to running cluster

For adding a roll to an already running cluster, run the following commands on the frontend as root:

# rocks add roll /path/to/<roll_name>.iso
# rocks enable roll <roll_name>
# rocks create distro
# rocks run roll <roll_name> | sh
# reboot

After the frontend comes back up, do the following to populate the node list:
# rocks sync config
Then kickstart all your nodes:
# tentakel /boot/kickstart/cluster-kickstart
After the nodes are reinstalled they should automatically pop up in the queueing system.
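To confirm that the roll was added and enabled before rebuilding the distribution, a quick hedged check:

# rocks list roll

The new roll should appear in the list with its version and with 'enabled' set to yes.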

Adding packages to the cluster(compilers/softwares)

All packages have to be installed in /share/apps. This directory is the default NFS-exported directory of the frontend, mounted on all other nodes of the cluster.

1) First example: installation of PGI

Download the tar package file from the PGI site and run the following commands:
#tar zxvf pgi-7.2.4.tgz
#cd pgi-7.2.4
#./configure --prefix=/share/apps/pgi-7.2.4
#make
#make install

This installs the PGI-MPICH compiler in the /share/apps directory of the frontend, so the compiler can be used on all the compute nodes. Copy the license file into /share/apps/pgi-7.2.4/

Adding packages to the cluster II(compilers/softwares)

Add the following lines to the bash startup script to set up the environment variables:

export PATH=/share/apps/pgi-7.2.4/linux86-64/7.2-4/bin:$PATH
export PGI=/share/apps/pgi-7.2.4
export LM_LICENSE_FILE=$PGI/license.dat
export LD_LIBRARY_PATH=/share/apps/pgi-7.2.4/linux86-64/7.2-4/lib:$LD_LIBRARY_PATH
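A quick hedged check that the compiler is picked up from the shared install (the -V flag of the PGI compilers prints their version banner):

$which pgf90    (should point into /share/apps/pgi-7.2.4)
$pgf90 -V       (prints the PGI version banner)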

2) Second example: configuring Open MPI with PGI

Download the tar package file from http://www.open-mpi.org/ and run the following commands:
#tar xvzf openmpi-1.3.tgz
#cd openmpi-1.3
#./configure CC=pgcc CXX=pgCC F77=pgf77 F90=pgf90 --prefix=/share/apps/openmpi/pgi/ --with-tm=/opt/torque
#make
#make install

Adding packages to the cluster III(compilers/softwares)

Add the following lines to the bash startup script:
export PATH=/share/apps/openmpi/pgi/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/openmpi/pgi/lib:$LD_LIBRARY_PATH

Commands to compile and execute MPI codes

For the MPICH-PGI shared library compilers (Fortran, C, C++):
Compiling:
$/share/apps/pgi-7.2.4/linux86-64/7.2/mpi/mpich/bin/mpif90 code.f -o code.exe
Executing:
$/share/apps/pgi-7.2.4/linux86-64/7.2/mpi/mpich/bin/mpirun -np <number> code.exe

For the OpenMPI-PGI shared library compilers (Fortran, C, C++):
Compiling:
$/share/apps/openmpi/pgi-7.2.4/bin/mpif90 code.f -o code.exe
Executing:
$/share/apps/openmpi/pgi-7.2.4/bin/mpirun -np <number> code.exe
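Because several MPI builds live side by side under /share/apps, it is worth checking which wrapper and back-end compiler a plain mpif90 resolves to (a hedged sketch; -show is an MPICH wrapper option, ompi_info ships with Open MPI):

$which mpif90
$mpif90 -show        (MPICH wrappers print the underlying compiler command line)
$ompi_info | head    (summary of the Open MPI build, including the compilers used)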

Commands to submit a job in interactive mode or batch mode

In interactive mode:
$qsub -I -q <dual or quad> -l nodes=<no. of nodes>:ppn=<4 or 8>

In batch mode:
$qsub job.sh

Content of job.sh:
$cat job.sh
#!/bin/sh
#PBS -q quad -l nodes=2:ppn=8,walltime=12:00:00

code=code.exe
#==================================
# Don't modify below lines
#==================================
cd $PBS_O_WORKDIR
echo `cat $PBS_NODEFILE` > host
/usr/bin/killsh $code $PBS_O_WORKDIR
/share/apps/pgi-7.2.4/linux86-64/7.2/mpi/mpich/bin/mpirun -machinefile $PBS_NODEFILE -np `cat $PBS_NODEFILE | wc -l` ./$code
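After submission, the job can be monitored with the standard Torque client commands (a hedged sketch; <job_id> is the number printed by qsub):

$qstat                (list queued and running jobs)
$qstat -n <job_id>    (show the nodes allocated to the job)
$qdel <job_id>        (remove the job from the queue if needed)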

Building of rocks cluster :Quota Setup

Quota setup has to be configured on the I/O node only.

#quotacheck -cvguf /home

#quotaon /home

#edquota <username>
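edquota opens the user's limits in an editor; a non-interactive alternative and a usage report (a hedged sketch, the block limits shown are only illustrative):

#setquota -u <username> 50000000 55000000 0 0 /home    (soft/hard block limits in 1 kB blocks, no inode limits)
#repquota -a    (report current usage and limits for all users)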

Building of rocks cluster :Queue Setup on frontend

#qmgr
Qmgr: create queue dual
Qmgr: set queue dual queue_type = Execution
Qmgr: set queue dual acl_host_enable = False
Qmgr: set queue dual acl_hosts = compute-0-0
Qmgr: set queue dual acl_hosts += compute-0-1
Qmgr: set queue dual resources_default.walltime = 12:00:00
Qmgr: set queue dual enabled = True
Qmgr: set queue dual started = True

Qmgr: create queue route
Qmgr: set queue route queue_type = Route
Qmgr: set queue route route_destinations = quad
Qmgr: set queue route route_destinations += dual
Qmgr: set queue route enabled = False
Qmgr: set queue route started = True
Qmgr: exit

Building of rocks cluster :Check current queue configuration

#qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue route
#
create queue route
set queue route queue_type = Route
set queue route route_destinations = quad
set queue route route_destinations += dual
set queue route enabled = False
set queue route started = True
#
# Create and define queue quad
#
create queue quad
set queue quad queue_type = Execution
set queue quad acl_host_enable = False
set queue quad acl_hosts = compute-1-9
set queue quad acl_hosts += compute-1-8
set queue quad resources_default.walltime = 12:00:00
set queue quad enabled = True
set queue quad started = True

#
# Create and define queue dual
#
create queue dual
set queue dual queue_type = Execution
set queue dual acl_host_enable = False
set queue dual acl_hosts = compute-0-9
set queue dual acl_hosts += compute-0-18
set queue dual acl_hosts += compute-0-8
set queue dual resources_default.walltime = 12:00:00
set queue dual enabled = True
set queue dual started = True

#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = False
set server acl_hosts = hyperx.aero.iitb.ac.in
set server managers = maui@hyperx.aero.iitb.ac.in
set server managers += root@hyperx.aero.iitb.ac.in
set server default_queue = route
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server next_job_number = 3413

Building of rocks cluster :Final reboot

Reboot hyperx:
#rocks run host "reboot"

Thank You
