YuChul Yang [email protected] Oct. 20, 2006, KPS 2006 Fall, EXCO, Daegu: The Current Status of KorCAF...
Oct. 20, 2006, KPS 2006 Fall
EXCO, Daegu  YuChul [email protected]
The Current Status of KorCAF and CDF Grid
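YuChul Yang, Sunghyun Chang, Mian Shabeer Ahmad, Adil Khan, Mohammad Ajmal, DaeJung Kong, JiEun Kim, JunSuhk Suh, DongHee Kim
(Kyungpook National University, Department of Physics)
YoungJang Lee, JiEun Jung, ChangSeong Moon, HyunSoo Kim, EunJu Jeon, KyungKwang Joo, SooBong Kim
(Seoul National University, Department of Physics)
JeongHwan Goh, JaeSeung Lee, Intae Yu
(Sungkyunkwan University, Department of Physics)
Kihyeon Cho
(Supercomputing Center, KISTI)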
Introduction to CDF Computing
Developed in 2001-2002 to respond to the experiment's greatly increased need for computational and data handling resources in Run II.
One of the first large-scale cluster approaches to user computing for general analysis.
Greatly increased the CPU power and data available to physicists.
CDF Grid via CAF, DCAF, SAM and SAMGrid
☞ DCAF (DeCentralized Analysis Farm)
☞ SAM (Sequential Access via Metadata): real data handling system
☞ SAMGrid: combination of SAM and the JIM (Job and Information Management) system
Outline
CAF (Central Analysis Farm): A large central computing resource at Fermilab, based on Linux cluster farms with a simple job management scheme.
DCAF (Decentralized CDF Analysis Farm): We extended the CAF model, including its command-line interface and GUI, to manage and work with remote resources.
Grid: We are now in the process of adapting and converting our workflow to the Grid.
Environment on CAF
All basic CDF software is pre-installed on the CAF.
Authentication via Kerberos
☞ Jobs run under mapped accounts, with the actual user authenticated through a special principal
☞ For database and data handling access, the remote user's ID is passed on through a lookup of the actual user via the special principal
The user’s analysis environment comes over in a tarball; there is no need to pre-register or to submit only certain kinds of jobs.
The job returns its results to the user via secure ftp/rcp, controlled by the user script and principal.
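The tarball step can be illustrated with a minimal Python sketch. It is not the CAF client itself: the function name `pack_analysis_dir` and the directory layout are hypothetical, and it only shows the general idea of bundling a user's working directory into one archive that travels with the job.

```python
import os
import tarfile


def pack_analysis_dir(src_dir, tarball_path):
    """Pack every file under src_dir into a gzip-compressed tarball.

    Hypothetical helper for illustration only; the real CAF submission
    tools build and ship the tarball themselves.
    Returns the sorted list of file names stored in the archive.
    """
    with tarfile.open(tarball_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for fname in files:
                full_path = os.path.join(root, fname)
                # Store paths relative to src_dir so the archive can be
                # unpacked in any working directory on a worker node.
                rel_path = os.path.relpath(full_path, src_dir)
                tar.add(full_path, arcname=rel_path)

    # Re-open the archive and list what was stored, as a sanity check.
    with tarfile.open(tarball_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

Because the archive carries the whole working directory, the worker node needs no per-user setup beyond unpacking it, which matches the "no need to pre-register" point above.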
In 2006, about 50% of the analysis farm capacity is outside of FNAL
Distributed clusters in Korea, Taiwan, Japan, Italy, Germany, Spain, UK, USA and Canada
Current DCAF approach
Cluster technology (CAF = “Central Analysis Farm”) extended to remote sites (DCAF = “Decentralized CDF Analysis Farm”)
Multiple batch systems supported: converting from the FBSNG system to Condor on all DCAFs
The SAM data handling system is required for offsite DCAFs
Current CDF Dedicated Resources
Source: http://www-cdf.fnal.gov/internal/fastnavigator/fastnavigator.html (Aug. 2006)
Detail of KorCAF resources (updated 2006)
TYPE                                   CPU                RAM   HDD     NO
head node (cluster46.knu.ac.kr)        AMD MP2000 * 2     2G    80G     1
SAM station (cluster67.knu.ac.kr)      Pentium 4 2.4G     1G    80G     1
submission node (cluster52.knu.ac.kr)  Pentium 4 2.4G     1G    80G     1
worker node                            AMD MP2000 * 2     2G    80G     4
                                       AMD MP2200 * 2     1G    80G     2
                                       AMD MP2800 * 2     2G    80G     11
                                       AMD MP2800 * 2     2G    250G    2
                                       Pentium 4 2.4G     1G    80G     15
                                       Xeon 3.G * 2       2G    80G     9
                                       Xeon 3.G * 2       2G    80G     3
Total                                  81 CPU (179.9GHz)  79G   4260G   49
Worker node hosts: cluster39~cluster73 (21), cluster102~cluster114 (13), cluster122~cluster130 (9), cluster137~cluster139 (3)
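The totals in the KorCAF resource table can be cross-checked with a short tally. The sketch below assumes that "* 2" means two CPUs per node and that the RAM and HDD columns are per-node values; the `NodeGroup` class and `totals` function are illustrative names, not part of any KorCAF tool. The aggregate clock speed (179.9GHz) is not recomputed because the table does not list per-CPU clock rates for the AMD models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    role: str
    cpu_model: str
    cpus_per_node: int  # assumption: "* 2" in the table means dual-CPU nodes
    ram_gb: int         # assumption: RAM column is per node
    hdd_gb: int         # assumption: HDD column is per node
    nodes: int


# Rows transcribed from the KorCAF resource table.
KORCAF = [
    NodeGroup("head node", "AMD MP2000", 2, 2, 80, 1),
    NodeGroup("SAM station", "Pentium 4 2.4G", 1, 1, 80, 1),
    NodeGroup("submission node", "Pentium 4 2.4G", 1, 1, 80, 1),
    NodeGroup("worker node", "AMD MP2000", 2, 2, 80, 4),
    NodeGroup("worker node", "AMD MP2200", 2, 1, 80, 2),
    NodeGroup("worker node", "AMD MP2800", 2, 2, 80, 11),
    NodeGroup("worker node", "AMD MP2800", 2, 2, 250, 2),
    NodeGroup("worker node", "Pentium 4 2.4G", 1, 1, 80, 15),
    NodeGroup("worker node", "Xeon 3.G", 2, 2, 80, 9),
    NodeGroup("worker node", "Xeon 3.G", 2, 2, 80, 3),
]


def totals(groups):
    """Sum node count, CPU count, RAM and disk over all node groups."""
    return {
        "nodes": sum(g.nodes for g in groups),
        "cpus": sum(g.cpus_per_node * g.nodes for g in groups),
        "ram_gb": sum(g.ram_gb * g.nodes for g in groups),
        "hdd_gb": sum(g.hdd_gb * g.nodes for g in groups),
    }


print(totals(KORCAF))
```

Under these assumptions the tally reproduces the table's Total row: 49 nodes, 81 CPUs, 79G of RAM and 4260G of disk.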
Storage status
           CPU            RAM   HDD     NO
Current                         0.6TB
           Opteron dual   2G    4TB     1
           Xeon dual      1G    1TB     1
Total                           5.6TB   2
Working on CondorCAF batch system
cdfsoft installed products: 4.11.1, 4.11.2, 4.8.4, 4.9.1, 4.9.1hpt3, 5.2.0, 5.3.0, 5.3.1, 5.3.3, 5.3.3_nt, 5.3.4, development
Installed binary products: 4.11.2, 5.3.1, 5.3.3, 5.3.3_nt, 5.3.4
CAF GUI & Monitoring System
Select farm
Process type
Submit status
User script, I/O file location
Data access
http://cluster46.knu.ac.kr/condorcaf
Functionality for User (KorCAF)
Feature                             Status
Self-contained user interface       Yes
Runs arbitrary user code            Yes
Automatic identity management       Yes
Network delivery of results         Yes
Input and output data handling      Yes
Batch system priority management    Yes
Grid:
Automatic choice of farm            Not yet
Negotiation of resources            Not yet
Runs on arbitrary grid resources    Not yet
Total CDF Computing Requirements
        Input Conditions                           Resulting Requirements
Fiscal  Int L    Evts      Peak rate  Peak rate   Ana    Reco   Disk  Tape I/O  Tape Vol
Year    (fb-1)   (x 10^9)  (MB/s)     (Hz)        (THz)  (THz)  (PB)  (GB/s)    (PB)
2003    0.3      0.6       20         80          1.5    0.5    0.2   0.2       0.4
2004    0.7      1.1       20         80          4.0    0.7    0.3   0.5       1.0
2005    1.2      2.4       40         220         7.2    1.0    0.7   0.9       2.0
2006    2.7      4.7       60         360         16     1.4    1.2   1.9       3.3
2007    4.4      7.1       60         360         26     2.8    1.8   3.0       4.9
Analysis CPU, disk, and tape needs scale with the number of events.
The FNAL portion of analysis CPU is assumed to be roughly 50% beyond 2005.
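The two statements under the requirements table can be checked against its numbers with a small sketch. The function names are illustrative, and reading "beyond 2005" as fiscal years 2006 and later is an assumption.

```python
# Fiscal year -> (events in units of 1e9, analysis CPU in THz),
# transcribed from the "Evts" and "Ana" columns of the requirements table.
REQUIREMENTS = {
    2003: (0.6, 1.5),
    2004: (1.1, 4.0),
    2005: (2.4, 7.2),
    2006: (4.7, 16.0),
    2007: (7.1, 26.0),
}


def ana_cpu_per_event(table):
    """Analysis CPU (THz) needed per 1e9 events, for each fiscal year."""
    return {year: ana / evts for year, (evts, ana) in table.items()}


def offsite_ana_cpu(table, fnal_fraction=0.5, first_year=2006):
    """Analysis CPU (THz) expected from outside FNAL.

    Assumption: "beyond 2005" is taken to mean fiscal years >= 2006,
    and the FNAL share is exactly fnal_fraction in those years.
    """
    return {
        year: ana * (1.0 - fnal_fraction)
        for year, (_evts, ana) in table.items()
        if year >= first_year
    }


for year, ratio in sorted(ana_cpu_per_event(REQUIREMENTS).items()):
    print(year, round(ratio, 2))
print(offsite_ana_cpu(REQUIREMENTS))
```

The per-event ratio stays within roughly 2.5 to 3.7 THz per 10^9 events across all five years, consistent with analysis CPU scaling with the number of events. With a 50% FNAL share, the offsite analysis requirement comes to 8 THz in 2006 and 13 THz in 2007.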
Movement to Grid
It is the worldwide trend for HEP experiments.
We need to take advantage of global innovations and resources.
CDF still has a lot of data to be analyzed.
We cannot continue to expand dedicated resources.
☞ Use the Grid
Activities for CDF Grid
Testing various approaches to using Grid resources (Grid3/OSG and LCG)
Adapt the CAF infrastructure to run on top of the Grid using Condor glide-ins (GlideCAF)
Use direct submission via CAF interface to OSG and LCG
Use SAMGrid/JIM sandboxing as an alternate way to deliver experiment + user software
Combine DCAFs with Grid resources
Conclusions
CDF has successfully deployed a global computing environment (DCAFs) for user analysis.
A large portion (50%) of the experiment's total CPU resources is now provided offsite through a combination of DCAFs and other clusters. KorCAF (the DCAF in Korea) is running on the Condor batch system.
Active work is in progress to build bridges to true Grid methods and protocols, which provide a path to the future.