
Page 1:

Installation of Cluster using Rocks

Vadivelan, Vighnesh

Date: 7/04/2010, Department of Aerospace Engineering, Indian Institute of Technology Bombay

Page 2:

Hardware configuration

                          Hyper                 Hyperx Headnode       Hyperx NAS
Model                     HP DL 140 G3          HP DL 160 G5          HP DL 160 G5
Processor configuration   Intel Xeon CPU 5150   Intel Xeon CPU E5430  Intel Xeon CPU E5430
                          @ 2.66 GHz            @ 2.66 GHz            @ 2.66 GHz
Processors                2                     2                     2
CPU cores per processor   2                     4                     4
Cache size                4096 KB               6144 KB               6144 KB
RAM                       4040864 kB            8182224 kB            8182224 kB
Hard disk                 60 GB                 160 GB                2 TB

Page 3:

Pre-installation knowledge

Supported hardware – Hyper will not support Rocks-5.2 and Hyperx will not support Rocks-4.3

eth1 and eth0 – Check which port is public and which is private from the catalogue. For Rocks, eth0 → private and eth1 → public

Cable connections – Between node-to-Ethernet switch and Ethernet-to-Ethernet

Compatibility with the OS that is going to be installed – Check that the OS is compatible with software such as the PGI compilers

Availability of the required roll CDs or DVDs for the Rocks installation

Source: http://www.rocksclusters.org/wordpress
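Which physical port the kernel has named eth0 and eth1 can also be cross-checked from a running Linux system; a minimal sketch, assuming the usual sysfs layout (interface names and MAC addresses will differ per machine):

```shell
#!/bin/sh
# List the network interfaces known to the kernel, with their MAC
# address and link state, so each can be matched against the catalogue.
for iface in /sys/class/net/*; do
    name=$(basename "$iface")
    mac=$(cat "$iface/address" 2>/dev/null)
    state=$(cat "$iface/operstate" 2>/dev/null)
    echo "$name mac=$mac state=$state"
done
```

Matching the MAC addresses printed here against the hardware catalogue tells you which port Rocks will treat as eth0 (private) and which as eth1 (public).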

Page 4:

Rolls

Page 5:

Rolls

Base – contains basic utilities for starting the cluster installation

Bio – a collection of some of the most common bio-informatics tools

Ganglia – installs and configures the Ganglia cluster monitoring system

HPC – the OPENMPI, MPICH and MPICH2 packages are included in this roll

Kernel – includes all the kernel utilities

OS – contains operating system utilities, CentOS 5.1 for Rocks 5.2

Page 6:

Rolls

Area 51 – contains utilities and services used to analyze the integrity of the files and the kernel on your cluster

Torque – contains the torque/maui job scheduling and queuing tools

Web-server – required for setting up the cluster on the public internet for monitoring purposes

Service-pack – contains all the bug fixes for the Rocks version

PGI – contains all the PGI compiler packages

Intel-Developer – contains all the Intel compiler packages

Page 7:

Hyperx Cluster Layout

Page 8:

Building of rocks cluster: The beginning

Download the ISO images of the rolls from the Rocks web site

Burn the ISO image of the jumbo roll onto a DVD

Burn the ISO images of the additional required rolls onto CDs

Mount the jumbo roll DVD in the DVD drive of the node which will act as the frontend node of the cluster, or connect an external USB DVD drive to the frontend if there is no onboard DVD drive
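Before burning, it is worth verifying each downloaded ISO against the MD5 checksum published alongside it on the Rocks download page; a sketch (the ISO and checksum file names below are illustrative, not the actual roll names):

```shell
#!/bin/sh
# Verify a downloaded ISO against its published MD5 checksum before
# burning it. File names here are illustrative placeholders.
md5sum roll.iso
# Compare the printed hash with the one on the download page, or, if a
# checksum file was downloaded next to the ISO, check it directly:
md5sum -c roll.iso.md5
```

A corrupted download caught here saves a wasted disc and a failed installation later.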

Page 9:

Building of rocks cluster: Frontend installation

Boot the frontend node. The boot order in the BIOS should be set to CD/DVD drive first

After the frontend boots up, the following screen will be displayed

If you have used onboard CD/DVD drive type 'build'

If you have used external CD/DVD drive type 'build driverload=usb-storage'

Note: The above screen will remain for only a few seconds; if you miss typing these commands, the node will be installed as a compute node and you will have to reboot the frontend and reinstall the node as a frontend

Page 10:

Building of rocks cluster: Frontend installation

Soon the following screen will be displayed

From this screen, you'll select your rolls.

In this procedure, we'll only be using CD media, so we'll only be clicking on the 'CD/DVD-based Roll' button.

Click the 'CD/DVD-based Roll' button

Page 11:

Building of rocks cluster: Frontend installation

The CD tray will eject and you will see this screen

Put your first roll in the CD tray (for the first roll, since the Jumbo Roll DVD is already in the tray, simply push the tray back in).

Click the 'Continue' button.

Page 12:

Building of rocks cluster: Frontend installation

The rolls will be discovered and the following screen will be displayed

Select the desired rolls listed earlier and click submit

Page 13:

Building of rocks cluster: Frontend installation

Network settings for private (eth0)

Hyperx : 192.168.2.1/255.255.0.0

Page 14:

Building of rocks cluster: Frontend installation

Network settings for public (eth1)

Hyperx : 10.101.2.3/255.255.255.0

Page 15:

Building of rocks cluster: Frontend installation

Network Settings

Hyperx : 10.101.250.1 / 10.200.1.11 (gateway / DNS)

Page 16:

Building of rocks cluster: Frontend installation

Head Node Partition

By default, auto-partitioning will be used

Page 17:

Building of rocks cluster: Frontend installation

Manual partitioning done for hyperx:

/ (root) : 20 GB

swap : 8 GB

/var : 4 GB

/export : remaining space

Page 18:

Building of rocks cluster: Frontend installation

Installation Process Starts

Page 19:

Building of rocks cluster: Frontend disk partition

$ df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/sda1               19G  6.4G   12G  36% /
/dev/sda4              107G  5.8G   96G   6% /export
/dev/sda3              3.8G  429M  3.2G  12% /var
tmpfs                  4.0G     0  4.0G   0% /dev/shm
tmpfs                  2.0G   16M  1.9G   1% /var/lib/ganglia/rrds
nas-0-0:/export/data1  1.3T   78G  1.2T   7% /export/home

Page 20:

Building of rocks cluster: Client installation

Log in as root on the frontend and type 'insert-ethers'

PXE boot the node which will act as the first compute node of the cluster

For compute node installation, select Compute

Page 21:

Building of rocks cluster: Client installation

After the frontend accepts the DHCP request from the compute node, communication between the frontend and the compute node will be established and the following screen will be displayed

During the process, the frontend will detect the MAC address of the compute node which is being installed

Page 22:

Building of rocks cluster: Client installation

After discovering the MAC address, the frontend will allot a hostname and IP address to that particular compute node

The frontend will name the first compute node detected compute-0-0 and will continue with compute-0-1, compute-0-2 and so on

The image shows that kickstart has started on the compute node (which means the installation process has started)

Page 23:

Building of rocks cluster: Client installation

The installation process starts and the following screen will be displayed on the compute node

For each compute node installation, follow the steps presented in slides 20 to 23

After all the compute nodes in rack 0 are installed, to install the compute nodes in rack 1 type 'insert-ethers -rack=1 -rank=0'. This will start naming the compute nodes in rack 1 as compute-1-0 and so on
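The compute-&lt;rack&gt;-&lt;rank&gt; naming scheme described above can be previewed with a small loop (the rack and per-rack node counts below are illustrative, not the actual hyperx layout):

```shell
#!/bin/sh
# Print the hostnames Rocks will assign for a hypothetical layout of
# 2 racks with 3 nodes each (counts are illustrative).
for rack in 0 1; do
    for rank in 0 1 2; do
        echo "compute-$rack-$rank"
    done
done
```

This makes it easy to sanity-check that 'insert-ethers -rack=1 -rank=0' will continue the numbering where you expect.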

Page 24:

Building of rocks cluster: NAS installation

Log in as root on the frontend and type 'insert-ethers'

PXE boot the node which will act as the NAS node of the cluster

The frontend will name the first NAS node detected nas-0-0, and so on, in the same way as for compute nodes

For I/O node installation, select NAS Appliance

Page 25:

Building of rocks cluster: NAS configuration

All installation steps for the NAS are similar to those for compute nodes, except for the manual partitioning of the NAS disk space

For the hyperx NAS node, manual partitioning:

/ (root) : 20 GB

swap : 8 GB

/var : 4 GB

/export : remaining space

On the NAS, the default home directory is /export/data1

Page 26:

Building of rocks cluster: NAS partition

$ df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p4                 19G  3.4G   15G  19% /
/dev/mapper/VolGroup00-LogVol00  1.3T   78G  1.2T   7% /export/data1
/dev/cciss/c0d0p2                1.9G   90M  1.8G   5% /boot
tmpfs                            4.0G     0  4.0G   0% /dev/shm

Page 27:

Building of rocks cluster: NAS NFS configuration

After the NAS node boots up, carry out the following steps to set the NAS as the NFS server of the cluster:

Edit the file /etc/exports on the NAS node and add the following line:

/export/data1 192.168.2.0/255.255.255.0(rw,no_root_squash,sync)

Edit the file /etc/fstab on the head node and add the following line:

nas-0-0:/export/data1 /export/home nfs defaults 0 0

Run the command:

# mount -a

These steps will mount the /export/data1 directory of the NAS node on the /export/home directory of every other node of the cluster with a private IP address in the range 192.168.2.0 to 192.168.2.255
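The two file edits above can be scripted so that they are safe to re-run without duplicating lines; a minimal sketch using the paths from this slide (note that after changing /etc/exports the NFS server must re-read it, e.g. with exportfs -ra, before mount -a on the clients will succeed):

```shell
#!/bin/sh
# Append the NFS export and fstab entries only if they are not already
# present, so re-running the script never duplicates configuration lines.
EXPORT_LINE='/export/data1 192.168.2.0/255.255.255.0(rw,no_root_squash,sync)'
FSTAB_LINE='nas-0-0:/export/data1 /export/home nfs defaults 0 0'

add_line() {  # add_line <exact line> <file>
    grep -qxF "$1" "$2" 2>/dev/null || echo "$1" >> "$2"
}

add_line "$EXPORT_LINE" /etc/exports   # run on the NAS node
add_line "$FSTAB_LINE"  /etc/fstab     # run on the head node
```

grep -qxF matches the whole line literally, which is what makes the second run a no-op.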

Page 28:

Building of rocks cluster: NAS important directories

$ ls -l /export/
total 4
drwxr-xr-x 25 root root 4096 Apr 17 09:47 data1
$ ls -l /export/data1/
total 128
drwx------ 18 amjad       amjad        4096 Mar 30 09:52 amjad
-rw-------  1 root        root         9216 Apr 17 09:47 aquota.group
-rw-------  1 root        root         9216 Apr 17 09:52 aquota.user
drwx------  7 asgerali    asgerali     4096 Apr  2 11:56 asgerali
drwx------  6 avinash     avinash      4096 Apr 17 10:00 avinash
drwx------  6 ayan        ayan         4096 Jan 23 11:05 ayan
drwx------ 12 bharat      bharat       4096 Apr 16 13:38 bharat
drwx------ 10 halbe       halbe        4096 Apr 11 21:57 halbe
drwx------ 12 krish       krish        4096 Mar 28 12:23 krish
drwx------  2 root        root        16384 Jan  5 22:05 lost+found
drwx------ 25 nileshjrane nileshjrane  4096 Apr 18 22:11 nileshjrane
drwx------ 11 nitin       nitin        4096 Mar 29 00:58 nitin
drwx------ 23 pankaj      pankaj       4096 Apr 16 13:27 pankaj
drwx------  4 prasham     prasham      4096 Jan 28 21:08 prasham

Page 29:

Building of rocks cluster: Frontend important directories

$ ls -l /export/
total 4
drwxr-xr-x  9 root    root  4096 Apr 16 11:53 apps
drwxr-xr-x  4 biouser root  4096 Jan  6 18:19 bio
drwxr-xr-x 25 root    root  4096 Apr 17 09:47 home
drwx------  2 root    root 16384 Jan  6 17:40 lost+found
drwxr-xr-x  3 root    root  4096 Jun 18  2009 rocks
drwxr-xr-x  3 root    root  4096 Jan  6 18:08 site-roll
$ ls -l /export/home/
total 128
drwx------ 18 amjad    amjad    4096 Mar 30 09:52 amjad
-rw-------  1 root     root     9216 Apr 17 09:47 aquota.group
-rw-------  1 root     root     9216 Apr 17 09:52 aquota.user
drwx------  7 asgerali asgerali 4096 Apr  2 11:56 asgerali
$ ls -l /export/apps/
total 20
drwxr-xr-x 3 root root 4096 Jan 21 00:25 modules
drwxr-xr-x 3 root root 4096 Apr 16 11:54 mpich2
drwxr-xr-x 4 root root 4096 Jan 13 00:47 old.pgi-9.0.4
drwxr-xr-x 3 root root 4096 Feb 17 09:50 openmpi
drwxr-xr-x 4 root root 4096 Mar 11 15:47 pgi-7.2.4

Page 30:

Adding user and setting user home directories

For creating a user account, carry out the following steps as root on the frontend:

First run the commands:
# rocks set attr Info_HomeDirSrv nas-0-0.local
# rocks set attr Info_HomeDirLoc /export/data1

Add a user with the command:
# adduser <username>

After the user addition is done, synchronise the user information across all the nodes of the cluster with:
# rocks sync users

To remove a user account, run the commands:
# umount /export/home/<username>   (as root on the frontend)
# userdel <username>               (as root on the NAS)
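Before running adduser it is worth checking that the account does not already exist in the passwd database, since re-adding a synchronised user can leave stale entries; a sketch under that assumption (the username below is a placeholder):

```shell
#!/bin/sh
# Create an account only if the username is not already taken.
# "newuser" is a placeholder username.
username=newuser
if getent passwd "$username" > /dev/null; then
    echo "$username already exists, skipping"
else
    adduser "$username"        # then run: rocks sync users
fi
```

getent consults the same name-service sources the nodes see, so it also catches accounts that exist only in the synchronised user database.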

Page 31:

Adding a roll to a running cluster

For adding a roll to an already running cluster, run the following commands on the frontend as root:

# rocks add roll /path/to/<roll_name>.iso
# rocks enable roll <roll_name>
# rocks create distro
# rocks run roll <roll_name> | sh
# reboot

After the frontend comes back up, do the following to populate the node list:
# rocks sync config
Then kickstart all your nodes:
# tentakel /boot/kickstart/cluster-kickstart
After the nodes are reinstalled they should automatically pop up in the queueing system.

Page 32:

Adding packages to the cluster (compilers/software)

All packages have to be installed in /share/apps. This directory is the default NFS directory of the frontend, mounted on all other nodes of the cluster.

1) First example: installation of PGI

Download the tar package from the PGI site and run the following commands:
# tar zxvf pgi-7.2.4.tgz
# cd pgi-7.2.4
# ./configure --prefix=/share/apps/pgi-7.2.4
# make
# make install

This will install the PGI-MPICH compiler in the /share/apps directory of the frontend, so the compiler can be used on all the compute nodes. Copy the license file into /share/apps/pgi-7.2.4/

Page 33:

Adding packages to the cluster II (compilers/software)

Add the following lines to the bash startup script to set up the environment variables:

export PATH=/share/apps/pgi-7.2.4/linux86-64/7.2-4/bin:$PATH
export PGI=/share/apps/pgi-7.2.4
export LM_LICENSE_FILE=$PGI/license.dat
export LD_LIBRARY_PATH=/share/apps/pgi-7.2.4/linux86-64/7.2-4/lib:$LD_LIBRARY_PATH
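When these export lines live in a shared profile script that may be sourced more than once per session, guarding against duplicate PATH entries keeps the environment clean; a sketch under that assumption (prepend_path is a helper introduced here, not part of Rocks):

```shell
# Prepend a directory to PATH only if it is not already present,
# so sourcing the profile repeatedly never grows PATH.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already on PATH, do nothing
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_path /share/apps/pgi-7.2.4/linux86-64/7.2-4/bin
prepend_path /share/apps/pgi-7.2.4/linux86-64/7.2-4/bin   # second call is a no-op
```

The same helper works for LD_LIBRARY_PATH with the obvious substitution.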

2) Second example: configuring OpenMPI with PGI

Download the tar package from http://www.open-mpi.org/ and run the following commands:

# tar xvzf openmpi-1.3.tgz
# cd openmpi-1.3
# ./configure CC=pgcc CXX=pgCC F77=pgf77 F90=pgf90 --prefix=/share/apps/openmpi/pgi/ --with-tm=/opt/torque
# make
# make install

Page 34:

Adding packages to the cluster III (compilers/software)

Add the following lines to the bash startup script:

export PATH=/share/apps/openmpi/pgi/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/openmpi/pgi/lib:$LD_LIBRARY_PATH

Page 35:

Commands to compile and execute MPI codes

For the mpich-pgi shared library compiler (Fortran, C, C++):

Compiling:
$ /share/apps/pgi-7.2.4/linux86-64/7.2/mpi/mpich/bin/mpif90 code.f -o code.exe
Executing:
$ /share/apps/pgi-7.2.4/linux86-64/7.2/mpi/mpich/bin/mpirun -np <number> code.exe

For the openmpi-pgi shared library compiler (Fortran, C, C++):

Compiling:
$ /share/apps/openmpi/pgi-7.2.4/bin/mpif90 code.f -o code.exe
Executing:
$ /share/apps/openmpi/pgi-7.2.4/bin/mpirun -np <number> code.exe

Page 36:

Commands to submit job in interactive mode or batch mode

In interactive mode:
$ qsub -I -q <dual or quad> -l nodes=<no. of nodes>:ppn=<4 or 8>

In batch mode:
$ qsub job.sh

Content of job.sh:

$ cat job.sh
#!/bin/sh
#PBS -q quad -l nodes=2:ppn=8,walltime=12:00:00
code=code.exe
#==================================
# Don't modify below lines
#==================================
cd $PBS_O_WORKDIR
echo `cat $PBS_NODEFILE` > host
/usr/bin/killsh $code $PBS_O_WORKDIR
/share/apps/pgi-7.2.4/linux86-64/7.2/mpi/mpich/bin/mpirun -machinefile $PBS_NODEFILE -np `cat $PBS_NODEFILE | wc -l` ./$code
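The same job script can be made testable outside PBS by falling back to defaults when the PBS environment variables are unset; a sketch (killsh and the mpirun path from the slide are site-specific, so the launcher line is left as a comment):

```shell
#!/bin/sh
#PBS -q quad -l nodes=2:ppn=8,walltime=12:00:00
code=code.exe

# Fall back to sane defaults so the script also runs outside PBS,
# where PBS_O_WORKDIR and PBS_NODEFILE are not set.
: "${PBS_O_WORKDIR:=$PWD}"
: "${PBS_NODEFILE:=/dev/null}"

cd "$PBS_O_WORKDIR" || exit 1
np=$(wc -l < "$PBS_NODEFILE")
[ "$np" -gt 0 ] || np=1            # at least one process when testing
echo "would run $code on $np processes from $PBS_O_WORKDIR"
# Site-specific launcher, exactly as in the slide:
# /share/apps/pgi-7.2.4/linux86-64/7.2/mpi/mpich/bin/mpirun \
#     -machinefile "$PBS_NODEFILE" -np "$np" ./$code
```

Under PBS the defaults are never used, so the batch behaviour is unchanged; on a login node the script still runs and reports what it would do.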

Page 37:

Building of rocks cluster: Quota setup

Quota setup has to be configured on the I/O node only.

# quotacheck -cvguf /home

# quotaon /home

# edquota <username>

Page 38:

Building of rocks cluster: Queue setup on frontend

# qmgr
Qmgr: create queue dual
Qmgr: set queue dual queue_type = Execution
Qmgr: set queue dual acl_host_enable = False
Qmgr: set queue dual acl_hosts = compute-0-0
Qmgr: set queue dual acl_hosts += compute-0-1
Qmgr: set queue dual resources_default.walltime = 12:00:00
Qmgr: set queue dual enabled = True
Qmgr: set queue dual started = True

Qmgr: create queue route
Qmgr: set queue route queue_type = Route
Qmgr: set queue route route_destinations = quad
Qmgr: set queue route route_destinations += dual
Qmgr: set queue route enabled = False
Qmgr: set queue route started = True
Qmgr: exit

Page 39:

Building of rocks cluster: Check current queue configuration

# qmgr -c 'p s'
#
# Create queues and set their attributes.
#
# Create and define queue route
#
create queue route
set queue route queue_type = Route
set queue route route_destinations = quad
set queue route route_destinations += dual
set queue route enabled = False
set queue route started = True
#
# Create and define queue quad
#
create queue quad
set queue quad queue_type = Execution
set queue quad acl_host_enable = False
set queue quad acl_hosts = compute-1-9
set queue quad acl_hosts += compute-1-8
set queue quad resources_default.walltime = 12:00:00
set queue quad enabled = True
set queue quad started = True
#
# Create and define queue dual
#
create queue dual
set queue dual queue_type = Execution
set queue dual acl_host_enable = False
set queue dual acl_hosts = compute-0-9
set queue dual acl_hosts += compute-0-18
set queue dual acl_hosts += compute-0-8
set queue dual resources_default.walltime = 12:00:00
set queue dual enabled = True
set queue dual started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = False
set server acl_hosts = hyperx.aero.iitb.ac.in
set server managers = [email protected]
set server managers += [email protected]
set server default_queue = route
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server next_job_number = 3413
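When many queues exist, the queue names can be pulled out of the qmgr dump with standard text tools; a sketch that works on a saved copy of the output (queues.txt is an illustrative file name, generated on the frontend with qmgr -c 'p s' > queues.txt):

```shell
#!/bin/sh
# Extract the queue names from a saved 'qmgr -c "p s"' dump.
# 'queues.txt' is an illustrative file name for the captured output.
awk '/^create queue/ { print $3 }' queues.txt
```

Each queue is introduced by a "create queue <name>" line, so the third field of those lines is exactly the list of configured queues.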

Page 40:

Building of rocks cluster: Check current queue configuration

Reboot hyperx:
# rocks run host "reboot"

Page 41:

Thank You