gce data grid
DATE:10/4/06
GCE - DataGrid enabled via SRB
Ceasar Chen-Kai Sun
Outline
Goal
GCE SRB
Features
Service stack:
Reliability
(Logical Resource)
Scalability
(Access data)
Performance
Reliability
MCAT-SRB (Multi-MES)
MCAT DB
GSI-enabled
Outline (cont'd)
Implementation
Infrastructure
MCAT-SRB, nonMCAT-SRB
MCAT DB backup
GSI
Resource
Physical resource
Logical resource
Auto multi-replica for data
How to
Add an SRB/Resource node
Use SRB resource within GCE computing
SRB service layer
[Figure: SRB service layer. Labels: SRB service; MCAT service; multiple servers provide the service; master MCAT with MCAT DB; slave MCAT; DB backup server; other slave MCATs; backup; ODBC connection; MCAT connection; nonMCAT servers; computing nodes; data service; Infrastructure.]
SRB service layer (cont'd)
SRB service
Use multiple servers to provide the same service
Both at MES and non-MES
Back up the MCAT DB periodically
GSI-enabled
Use master/slave MES to provide the MCAT service
Load balancing
Failover
http://www.sdsc.edu/srb/index.php/Master_Slave_MCAT
Some features have not been implemented yet
Infrastructure
Data Resource layer
[Figure: Data Resource layer. Labels: SRB host A, SRB host B, SRB host C, SRB host D; physical resources A.input, A.arch, B.input, B.arch, C.arch; logical resources gce-arch, gce-upload; computing node; download data; upload data; data sync; Resource.]
Data Resource layer (cont'd)
Logical resource
Resource type: logical
Load balancing
Clients do not need to know on which physical resource the data is stored
Always use the logical resource
Scalability
Easy to add or remove a physical resource dedicated to a specific function (e.g. archive, data upload, ...)
Data Access Issue
Data synchronization between resources
Use the data multi-replica mechanism to synchronize data between all logical resources
GET/PUT priority
Resource
Data Resource layer (cont'd)
Access Issue:
The "GET" priority between resources
i) Always pick a copy stored in the cache/temporary class first and in the archival/permanent class last
ii) Within the same class, first pick the copy stored on the host the client is connected to
iii) If i) and ii) are the same, pick one at random
The "PUT" priority between resources
Resource name (-S, Senv)
Logical resource: client, SRB server, physical resource
http://www.sdsc.edu/srb/index.php/Advanced_Scommands
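The "GET" priority rules above can be sketched as a small selection function. This is only an illustration of the three rules, not SRB's actual code: the Replica type, the class ranking table, the rank given to classes the slide does not mention, and the storage classes assigned to the example resources are all assumptions.

```python
import random
from dataclasses import dataclass

# Lower rank is preferred. The slide only says cache/temporary come first and
# archival/permanent come last; DEFAULT_RANK for any other class is an assumption.
CLASS_RANK = {"cache": 0, "temporary": 0, "archival": 2, "permanent": 2}
DEFAULT_RANK = 1


@dataclass(frozen=True)
class Replica:
    resource: str       # physical resource name
    host: str           # SRB host that owns the resource
    storage_class: str  # e.g. "cache" or "archival"


def pick_replica(replicas, connected_host, rng=random):
    """Pick the copy to read, following rules i) to iii) of the GET priority."""
    if not replicas:
        raise ValueError("object has no replicas")

    def key(replica):
        class_rank = CLASS_RANK.get(replica.storage_class, DEFAULT_RANK)  # rule i
        host_rank = 0 if replica.host == connected_host else 1            # rule ii
        return (class_rank, host_rank)

    best = min(key(r) for r in replicas)
    candidates = [r for r in replicas if key(r) == best]
    return rng.choice(candidates)                                         # rule iii


# Hypothetical example: the storage classes below are made up for illustration.
example = [
    Replica("gce-ws.storage", "gce-ws", "archival"),
    Replica("pika191.storage", "pika191", "archival"),
    Replica("gcenode1.upload", "gcenode1", "cache"),
]
print(pick_replica(example, "pika191").resource)  # prints gcenode1.upload
```

The cache copy wins even though the client is connected to pika191, because rule i) is applied before rule ii).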
Resource
Why
Reliability: multi-replica data
Access performance: nearest accessibility
How
Srsync, Sbkupsrb
Logical resource (LR)
Procedure
1. Upload new objects using LR-upload
2. Make multi-replica data in LR-arch using Sbkupsrb
physical resource
3. Clients download the data and do the computing
4. Upload updated and new data as in step 1
Step 2 needs to be executed automatically on the SRB system
Auto multi-replica mechanism
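The upload-then-replicate procedure can be pictured with a toy in-memory catalog. This sketch does not call SRB, Srsync or Sbkupsrb; the catalog dictionary and both function names are hypothetical, and it only shows why the replication step can safely run repeatedly and automatically.

```python
def upload(catalog, name, upload_resource):
    """Upload step: register an object with one copy on the upload resource."""
    holders = catalog.setdefault(name, [])
    if upload_resource not in holders:
        holders.append(upload_resource)


def replicate_to_arch(catalog, arch_members):
    """Replication step: ensure every object has a copy on each member of LR-arch.

    Returns the (object, resource) pairs that were newly created, so a second
    run with nothing new to do returns an empty list.
    """
    created = []
    for name, holders in catalog.items():
        for resource in arch_members:
            if resource not in holders:
                holders.append(resource)
                created.append((name, resource))
    return created


catalog = {}
upload(catalog, "input.dat", "gcenode1.upload")
replicate_to_arch(catalog, ["pika191.storage", "gce-ws.storage"])
print(catalog)
# prints {'input.dat': ['gcenode1.upload', 'pika191.storage', 'gce-ws.storage']}
```

The list position of each holder plays the role of the replica number (0, 1, 2) seen in an Sls listing.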
Auto multi-replica mechanism (Sample)
Initially uploaded data:
$ Sls -al
/home/srbadmin.gce-srb:
srbadmin 0 gcenode1.upload 128 2006-09-26-17.01 % input.dat
srbadmin 0 gcenode1.upload 620 2006-09-26-17.00 % pika191.ifconfig.txt
After auto multi-replica:
$ Sls -al
/home/srbadmin.gce-srb:
srbadmin 0 gcenode1.upload 128 2006-09-26-17.01 % input.dat
srbadmin 1 pika191.storage 128 2006-09-26-17.08 % input.dat
srbadmin 2 gce-ws.storage 128 2006-09-26-17.08 % input.dat
srbadmin 0 gcenode1.upload 620 2006-09-26-17.00 % pika191.ifconfig.txt
srbadmin 1 pika191.storage 620 2006-09-26-17.08 % pika191.ifconfig.txt
srbadmin 2 gce-ws.storage 620 2006-09-26-17.08 % pika191.ifconfig.txt
auto multi-replica LR PR GET
Real GCE-SRB Environment
DB service
PostgreSQL: gridportal
Backup DB: gce-ws
SRB Service
MES (MCAT-Enabled SRB): gce-ws, pika191
NonMES: gcenode1
GSI DN:
C=tw, O=nchc, OU=Grid, CN=srb-gsi/gce-ws.nchc.org.tw
Resource
PR: gce-ws.storage,gce-ws.upload, pika191.storage, pika191.upload, gcenode1.storage, gcenode1.upload
LR:
gce-arch : gce-ws.storage, pika191.storage, gcenode1.storage
gce-upload: gce-ws.upload, pika191.upload,
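The logical-to-physical grouping of this environment can be written down as a lookup table. The sketch below is illustrative only: the dictionary copies the LR membership exactly as listed on the slide, while the helper names and the assumption that a physical resource name has the form host.kind are mine.

```python
# LR membership as listed on the slide (the gce-upload entry is copied as shown).
LOGICAL_RESOURCES = {
    "gce-arch": ["gce-ws.storage", "pika191.storage", "gcenode1.storage"],
    "gce-upload": ["gce-ws.upload", "pika191.upload"],
}


def host_of(physical_resource):
    """Assumption: physical resources are named '<host>.<kind>'."""
    return physical_resource.rsplit(".", 1)[0]


def local_member(logical_resource, host):
    """Return the member of a logical resource that lives on the given host, or None."""
    for physical in LOGICAL_RESOURCES[logical_resource]:
        if host_of(physical) == host:
            return physical
    return None


print(local_member("gce-arch", "pika191"))    # prints pika191.storage
print(local_member("gce-upload", "gcenode1"))  # prints None
```

A client that always names the logical resource never needs this table; it is only useful to see which physical copy would be the nearest one for a given host.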
How to use SRB resource in GCE
Install sMover in each computing node
Download URL/Run install.sh
Must be Globus enabled
sMover function
Check out data
Check in data
Procedure
Get the parameters GCE-SRB needs from the portal
Make sure the user's GSI proxy is initialized
Run data check-out before the real computation
Run the computation
Run data check-in after the real computation
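The order of this procedure can be sketched as a small driver function. Everything here is hypothetical scaffolding: the function name and its five hooks are not part of sMover or the portal, and skipping check-in when the computation fails is an assumption, since the slides do not say what happens in that case.

```python
def run_gce_job(params, proxy_ok, check_out, compute, check_in):
    """Run one job in the order of the procedure and return the steps completed.

    params is whatever the portal returned; the other arguments are callables
    supplied by the caller (all hypothetical hooks).
    """
    steps = []
    if not proxy_ok():
        raise RuntimeError("the user's GSI proxy is not initialized")
    steps.append("proxy")

    check_out(params)   # fetch input data before the computation
    steps.append("co")

    compute(params)     # the real computation
    steps.append("compute")

    check_in(params)    # assumption: only reached if compute() did not raise
    steps.append("ci")
    return steps


# Dry run with stand-in hooks that do nothing.
print(run_gce_job({"resource": "gce-upload"},
                  proxy_ok=lambda: True,
                  check_out=lambda p: None,
                  compute=lambda p: None,
                  check_in=lambda p: None))
# prints ['proxy', 'co', 'compute', 'ci']
```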
sMover check-in data
sMover: get, put, ci, co
sMover checks in data to the gce-upload resource
/opt/sMover/bin/sSmover "[srb_serve]" "[port]" "[account]" "[domain]" "[resource]" "ci" "[local_path]" "[server_path]" "[server_DN]"
Example: check in data through the gce-ws SRB server
$ /opt/sMover/bin/sSmover "gce-ws.nchc.org.tw" "5544" "srbadmin" "gce-srb" "gce-upload" "ci" "/home/gceuser/gce-run" "/gce-run" "/C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw"
Listing on gce-srb:
$ Sls -al /home/srbadmin.gce-srb/gce-run
/home/srbadmin.gce-srb/gce-run:
srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat
srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh
srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat
After auto multi-replica:
$ Sls -al /home/srbadmin.gce-srb/gce-run
/home/srbadmin.gce-srb/gce-run:
srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat
srbadmin 1 gce-ws.storage 296 2006-09-26-18.09 % input.dat
srbadmin 2 pika191.storage 296 2006-09-26-18.09 % input.dat
srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh
srbadmin 1 gce-ws.storage 620 2006-09-26-18.09 % my-run.sh
srbadmin 2 pika191.storage 620 2006-09-26-18.09 % my-run.sh
srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat
srbadmin 1 gce-ws.storage 1997 2006-09-26-18.09 % output.dat
srbadmin 2 pika191.storage 1997 2006-09-26-18.09 % output.dat
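A listing like the ones above can be checked mechanically to confirm that every object has reached the storage resources. The sketch assumes each data line has exactly the seven whitespace-separated columns seen in this transcript (owner, replica number, resource, size, timestamp, a "%" marker, object name); the function names are hypothetical and nothing here calls Sls itself.

```python
from collections import defaultdict


def parse_sls(text):
    """Map object name -> list of (replica number, resource, size) from Sls -al text."""
    replicas = defaultdict(list)
    for line in text.splitlines():
        parts = line.split(None, 6)
        # Skip the command line, the collection header and anything malformed.
        if len(parts) != 7 or not parts[1].isdigit() or not parts[3].isdigit():
            continue
        _owner, repl, resource, size, _stamp, _marker, name = parts
        replicas[name].append((int(repl), resource, int(size)))
    return dict(replicas)


def missing_replicas(replicas, required):
    """Return {object: [required resources that hold no copy]} for incomplete objects."""
    missing = {}
    for name, copies in replicas.items():
        have = {resource for _, resource, _ in copies}
        lacking = [r for r in required if r not in have]
        if lacking:
            missing[name] = lacking
    return missing


listing = """\
/home/srbadmin.gce-srb/gce-run:
srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat
srbadmin 1 gce-ws.storage 296 2006-09-26-18.09 % input.dat
srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh
"""
print(missing_replicas(parse_sls(listing), ["gce-ws.storage", "pika191.storage"]))
# prints {'input.dat': ['pika191.storage'], 'my-run.sh': ['gce-ws.storage', 'pika191.storage']}
```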
sMover check-out data
sMover checks out data from the gce-arch resource
/opt/sMover/bin/sSmover "[srb_serve]" "[port]" "[account]" "[domain]" "[resource]" "co" "[local_path]" "[server_path]" "[server_DN]"
Example: check out data through the gcenode1 SRB server
$ /opt/sMover/bin/sSmover "gcenode1.nchc.org.tw" "5544" "srbadmin" "gce-srb" "gce-arch" "co" "/home/gceuser/gce-run" "/gce-run" "/C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw"
Listing on gce-srb:
$ Sls -al /home/srbadmin.gce-srb/gce-run
/home/srbadmin.gce-srb/gce-run:
srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat
srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh
srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat
After auto multi-replica:
$ Sls -al /home/srbadmin.gce-srb/gce-run
/home/srbadmin.gce-srb/gce-run:
srbadmin 0 gcenode1.upload 296 2006-09-26-18.07 % input.dat
srbadmin 1 gce-ws.storage 296 2006-09-26-18.09 % input.dat
srbadmin 2 pika191.storage 296 2006-09-26-18.09 % input.dat
srbadmin 0 gcenode1.upload 620 2006-09-26-18.07 % my-run.sh
srbadmin 1 gce-ws.storage 620 2006-09-26-18.09 % my-run.sh
srbadmin 2 pika191.storage 620 2006-09-26-18.09 % my-run.sh
srbadmin 0 gcenode1.upload 1997 2006-09-26-18.07 % output.dat
srbadmin 1 gce-ws.storage 1997 2006-09-26-18.09 % output.dat
srbadmin 2 pika191.storage 1997 2006-09-26-18.09 % output.dat
(with Computing )
Computing node
Globus-enabled: for GCE, it needs to be signed by the NCHC CA
sMover: http://gridportal.nchc.org.tw/gce-SRB_Installation/srbSmover.i386.20060919.tgz
Co -> do computing -> ci
sMover
defaultArchR=gce-arch
defaultZone=gcezone
"[srb_serve]" : gce-ws | pika191 | gcenode1
"[port]" : 5544
"[account]" : SRB account, comes from the portal
"[domain]" : gce-srb
"[resource]" : gce-upload
"ci|co" : check in or check out
"[local_path]" :
"[server_path]" : under the SRB user home, /home/account.gce-srb/xxx
"[server_DN]" : /C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw (fixed or not?)
check -in
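The positional parameters listed above can be assembled into an argument vector before the tool is invoked. This is a minimal sketch: smover_argv is a hypothetical helper, not part of sMover; the argument order, install path and default port and domain are taken from the slides; get and put are rejected because their argument layout is not shown.

```python
import shlex

SMOVER = "/opt/sMover/bin/sSmover"  # install path shown on the slides


def smover_argv(server, account, action, local_path, server_path, server_dn,
                port=5544, domain="gce-srb", resource="gce-upload"):
    """Build the sSmover argument vector in the positional order of the list above."""
    if action not in ("ci", "co"):
        raise ValueError("action must be 'ci' or 'co'")
    return [SMOVER, server, str(port), account, domain, resource,
            action, local_path, server_path, server_dn]


argv = smover_argv(
    server="gce-ws.nchc.org.tw",
    account="srbadmin",
    action="ci",
    local_path="/home/gceuser/gce-run",
    server_path="/gce-run",
    server_dn="/C=tw/O=nchc/OU=Grid/CN=srb-gsi/gce-ws.nchc.org.tw",
)
print(shlex.join(argv))  # shell-quoted form of the command, for inspection only
```

Passing the list to subprocess.run(argv, check=True) would execute it on a node where sMover is installed. For a check-out the slides use resource="gce-arch" and action="co".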
(with Portal )
gce
(From config file , ...)
SRB server
Data-management portal
File upload
File management
How to add an SRB node
http://gridportal.nchc.org.tw/gce-SRB_Installation/
GSI
System admin
Add location
Add physical resource
Add PR into LR (logical resource)
Reference
SDSC SRB
http://www.sdsc.edu/srb/
Scommands for advanced users
http://www.sdsc.edu/srb/index.php/Scommands#Advanced_Users
Mcat SRB GSI-enable Server
http://www.ceasar.tw/modules/news/article.php?storyid=9
How to use Scommands on SRB server
http://www.ceasar.tw/modules/news/article.php?storyid=8