Site report: Tokyo

Tomoaki Nakamura

ICEPP, The University of Tokyo

2014/12/10 Tomoaki Nakamura 1

Update from the last year

No HW upgrade from the last year for Grid resources
- 2560 CPU cores (18.03 HS06/core)
- RAM: 2 GB/core for 1280 cores, 4 GB/core for 1280 cores
- No memory upgrade until the end of 2015 (was considered last year)
- 2000 TB for pledged disk (2014) and ~600 TB for LocalGroupDisk

All service instances have been migrated to EMI3
- CREAM, DPM, BDII (site/top), Argus, gLexec-WN, APEL
- WMS, LB, MyProxy: can be decommissioned for ATLAS

Other service instances
- perfSONAR (latency 1G, bandwidth 1G, bandwidth 10G)
- Squid (condDB x 2 + CVMFS x 2)

Services for ATLAS have been deployed
- DPM-WebDAV: used for Rucio renaming, will be used for central deletion
- DPM-XrootD and FAX setup: connected with the Asia redirector
- Multi-core queue: 512 cores, 20% of resources, 64 static 8-core slots
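The resource figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above; the constant and function names are illustrative and not part of the original slides.

```python
# Figures quoted on the slide.
TOTAL_CORES = 2560
HS06_PER_CORE = 18.03
MULTICORE_SLOTS = 64
CORES_PER_SLOT = 8


def multicore_summary():
    """Return (multi-core cores, fraction of all cores, total HS06)."""
    multicore_cores = MULTICORE_SLOTS * CORES_PER_SLOT
    fraction = multicore_cores / TOTAL_CORES
    total_hs06 = TOTAL_CORES * HS06_PER_CORE
    return multicore_cores, fraction, total_hs06


if __name__ == "__main__":
    cores, fraction, hs06 = multicore_summary()
    print(f"multi-core queue: {cores} cores ({fraction:.0%} of the site)")
    print(f"total capacity: {hs06:.1f} HS06")
```

The 64 static 8-core slots account for the 512 cores and the 20% share, and 2560 cores at 18.03 HS06/core match the 46156.8 HS06 deployed figure given in the pledge table.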

FAX remote access

4TB / day = ~46 MB / sec
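The conversion above can be reproduced as follows. This is a minimal sketch assuming decimal units (1 TB = 10^6 MB); the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day_to_mb_per_s(tb_per_day):
    """Convert a daily volume in TB to an average rate in MB/s (decimal units)."""
    return tb_per_day * 1e6 / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{tb_per_day_to_mb_per_s(4):.1f} MB/s")
```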

ASAP (all data)

(ATLAS Site Availability Performance)

99.77%

Pledge for the next year and beyond

                2013                  2014                  2015
CPU pledge      16000 [HS06]          20000 [HS06]          24000 [HS06]
CPU deployed    43673.6 [HS06-SL5]    46156.8 [HS06-SL6]    -
                (2560 cores)          (2560 cores)
Disk pledge     1600 [TB]             2000 [TB]             2400 [TB]
Disk deployed   2000 [TB]             2000 [TB]             -

For FY2015
- Increase the pledge by 400 TB
- 528 TB (8 servers) will be added to DPM by the end of Mar. 2015
- Total DPM capacity: 3168 TB (~750 TB for LocalGroupDisk)

End of 2015: end of this system
- Procurement work will start next spring
- If we can get 6 TB HDDs, the total storage capacity can be doubled in the 4th system
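The FY2015 disk numbers are mutually consistent, which a short calculation makes explicit. The sketch below uses only figures quoted in this slide; reading the capacity beyond the pledge as the LocalGroupDisk share is an assumption made here for illustration, and all names are hypothetical.

```python
# Figures quoted on the slide (all in TB).
PLEDGE_2014_TB = 2000
PLEDGE_INCREASE_TB = 400
ADDED_TB = 528
ADDED_SERVERS = 8
TOTAL_DPM_TB = 3168


def fy2015_disk_summary():
    """Derive the implied per-server and beyond-pledge capacities."""
    pledge_2015 = PLEDGE_2014_TB + PLEDGE_INCREASE_TB
    return {
        "pledge_2015_tb": pledge_2015,
        "tb_per_added_server": ADDED_TB / ADDED_SERVERS,
        "dpm_before_addition_tb": TOTAL_DPM_TB - ADDED_TB,
        # Assumption: everything above the pledge is available to LocalGroupDisk.
        "beyond_pledge_tb": TOTAL_DPM_TB - pledge_2015,
    }


if __name__ == "__main__":
    for key, value in fy2015_disk_summary().items():
        print(f"{key}: {value}")
```

Under that assumption the capacity beyond the 2400 TB pledge comes to 768 TB, in line with the "~750 TB" quoted for LocalGroupDisk.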

International network for Tokyo

[Figure: map of the international network paths from Tokyo. Labels: TOKYO, OSAKA, LA, WIX, Amsterdam, Geneva, Frankfurt; Pacific, Atlantic; sites ASGC, BNL, TRIUMF, NDGF, RAL, CC-IN2P3, CERN, CNAF, PIC, SARA, NIKHEF; link capacities 10 Gbps, 10x3 Gbps, 40 Gbps; "Dedicated line"; "New line (10 Gbps) since May 2013".]

Configuration for the LHCONE evaluation

[Figure: network configuration diagram. Switches and routers: MLXe32 (10G), Dell 8024 (10G) x 2, Dell 5448 (1G), Catalyst 6500 (10G), Catalyst 3750 (10G). Hosts: UI (GridFTP) x 2, perfSONAR (latency), perfSONAR (bandwidth), perfSONAR (latency/bandwidth). Networks: ICEPP (production) 157.82.112.0/21, ICEPP (LHCONE evaluation) 157.82.118.0/24, UTnet, SINET, IPv4/v6, LHCONE BGP peering; NY, DC, LA. Legend: 10 Gbps and 1 Gbps links.]
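The two address blocks in the diagram are related: the LHCONE evaluation /24 lies inside the production /21. A minimal sketch with Python's standard `ipaddress` module shows this; the variable and function names are illustrative, and only the two prefixes come from the slide.

```python
import ipaddress

# Prefixes as labeled in the diagram.
production = ipaddress.ip_network("157.82.112.0/21")
evaluation = ipaddress.ip_network("157.82.118.0/24")


def describe(net):
    """Return (first address, last address, number of addresses) of a network."""
    return str(net[0]), str(net[-1]), net.num_addresses


if __name__ == "__main__":
    print("production:", describe(production))
    print("evaluation:", describe(evaluation))
    # subnet_of() requires Python 3.7 or later.
    print("evaluation inside production:", evaluation.subnet_of(production))
```

Because the evaluation block is a more specific prefix carved out of the production range, it can be announced and routed separately while the remaining production addresses keep their existing path.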

Stability on packet loss (CC-IN2P3)

Packet loss directly affects the transfer rate.
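Why loss matters so much on a long path can be illustrated with the well-known Mathis approximation for a single loss-limited TCP stream, rate ≈ (MSS / RTT) × C / sqrt(p). This is not taken from the slides: the formula is a textbook model for Reno-style TCP, and the MSS, RTT, and loss values below are hypothetical inputs chosen only to show the scaling.

```python
import math


def mathis_throughput_mb_s(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Approximate ceiling on single-stream TCP throughput in MB/s.

    Uses the Mathis model: rate = (MSS / RTT) * C / sqrt(p).
    """
    if loss_rate <= 0:
        raise ValueError("loss_rate must be positive for this model")
    return (mss_bytes / rtt_s) * c / math.sqrt(loss_rate) / 1e6


if __name__ == "__main__":
    # Hypothetical values: 1460-byte MSS and a 0.3 s round-trip time.
    for p in (1e-6, 1e-5, 1e-4):
        rate = mathis_throughput_mb_s(1460, 0.3, p)
        print(f"loss {p:.0e}: {rate:.2f} MB/s per stream")
```

In this model a hundredfold increase in loss rate cuts the per-stream ceiling by a factor of ten, and a long round-trip time lowers it further, so a long-haul path is sensitive to even small loss fractions.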

Fraction of packet loss (NY vs. DC)

Comparable to each other.

Minimum latency (CC-IN2P3)

Useful to know the typical latency and stability.

Minimum latency (CC-IN2P3)

Originating from another group in the Univ. of Tokyo.

Distribution of Minimum latency (CC-IN2P3)

Distribution of Minimum latency (CC-IN2P3)

Originating from another group.

Mismeasurement.

Maximum latency (CC-IN2P3)

Useful to find problems.

Maximum latency (CC-IN2P3)

Also has spikes.

Additional periodic noise.

Distribution of Maximum latency (CC-IN2P3)

Distribution of Maximum latency (CC-IN2P3)

Discrepancy due to the periodic noise.

Also for the other sites

(US)

(FR)

• One of the perfSONAR instances in Tokyo seems to fall into a busy state once a day.

• It is independent of the source site.

• However, there are no significant errors in the system and service logs.

Maximum latency (masked by time)

Periodic noise can be cleaned up.
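Masking by time can be sketched as dropping every sample whose time of day falls inside the window where the daily busy state occurs. The example below is an illustration only: the sample values and the 03:00 to 04:00 window are hypothetical, and the slides do not state which window was actually masked.

```python
from datetime import datetime, time


def mask_daily_window(samples, start, end):
    """Drop samples whose time of day falls in [start, end).

    samples: list of (datetime, latency_ms) pairs.
    Assumes the window does not wrap past midnight.
    """
    return [(ts, value) for ts, value in samples
            if not (start <= ts.time() < end)]


if __name__ == "__main__":
    # Hypothetical maximum-latency samples in milliseconds.
    samples = [
        (datetime(2014, 11, 1, 2, 30), 150.0),
        (datetime(2014, 11, 1, 3, 15), 900.0),  # inside the masked window
        (datetime(2014, 11, 1, 4, 0), 152.0),
    ]
    for ts, value in mask_daily_window(samples, time(3, 0), time(4, 0)):
        print(ts.isoformat(), value)
```

A fixed daily mask fits noise that recurs at the same time every day; it does not remove spikes that occur at other times.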

Maximum latency by mask (CC-IN2P3)

Still remaining, but comparable.

Bandwidth measurement (CC-IN2P3 and CNAF)

Asymmetric: ~38 MB/s (incoming), ~28 MB/s (outgoing)

Symmetric, but unstable: ~34 MB/s (incoming), ~35 MB/s (outgoing)

Minimum latency (CC-IN2P3 in 2014)

Minimum latency (CC-IN2P3 in 2014)

Spikes are gone.

Average value is split.

Latency in one day (CC-IN2P3)

Both production line via NY

Incoming

Outgoing

Load balancing somewhere in NY or GEANT?

Maximum latency (CC-IN2P3, 2014)

Some improvement in FR-Geneva?

Bandwidth measurement (latest data)

Still asymmetric: ~35 MB/s (incoming), ~24 MB/s (outgoing)

Symmetric, and very stable: ~32 MB/s (incoming), ~30 MB/s (outgoing)

Configuration for the LHCONE evaluation

[Figure: the network configuration diagram for the LHCONE evaluation, shown again. Labels: MLXe32 (10G), Dell 8024 (10G) x 2, Dell 5448 (1G), Catalyst 6500 (10G), Catalyst 3750 (10G); UI (GridFTP) x 2, perfSONAR (latency), perfSONAR (bandwidth), perfSONAR (latency/bandwidth); ICEPP (production) 157.82.112.0/21, ICEPP (LHCONE evaluation) 157.82.118.0/24, UTnet, SINET, IPv4/v6, LHCONE BGP peering; NY, DC, LA; 10 Gbps and 1 Gbps links.]

LHCONE (EU sites) for all production servers

[Figure: network configuration diagram. Labels: MLXe32 (10G), Dell 8024 (10G) x 2, Dell 5448 (1G), Catalyst 6500 (10G), Catalyst 3750 (10G); UI (GridFTP) x 2, perfSONAR (latency), perfSONAR (bandwidth), perfSONAR (latency/bandwidth); ICEPP (production) 157.82.112.0/21, ICEPP (LHCONE evaluation) 157.82.118.0/24, UTnet, SINET, IPv4/v6, LHCONE BGP peering; NY, DC, LA; 10 Gbps and 1 Gbps links.]

Nov. 11, 2014 (latency for CCIN2P3)

Nov. 11, 2014 (latency for CNAF)

Nov. 11 (throughput for CCIN2P3)

Nov. 11 (throughput for CNAF)

Dec. 7, 2014 (incoming B.W. is saturated)

User subscription of AOD via DaTri: physics.Egamma, 8 TeV, all periods: ~150 TB

Still ongoing today (continuously for several days)
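A rough lower bound on the duration of such a subscription follows from volume divided by link capacity. The sketch below assumes a 10 Gbps incoming link (the link speed shown in the network slides) and treats the efficiency factor as a hypothetical knob; neither the real share available to this transfer nor the function name comes from the slides.

```python
def transfer_hours(volume_tb, link_gbps, efficiency=1.0):
    """Hours needed to move volume_tb over a link of link_gbps.

    efficiency is the assumed fraction of the link usable by this transfer.
    Decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = volume_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    print(f"at full line rate: {transfer_hours(150, 10):.1f} h")
    print(f"at 50% of the link: {transfer_hours(150, 10, 0.5):.1f} h")
```

Even at full line rate, ~150 TB needs well over a day on a 10 Gbps link, so a saturation period lasting several days is plausible once other traffic shares the line.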

Breakdown from GridFTP log

Part of LHCONE contribution

Mainly FTS3 and direct transfer from multiple sites

10 min. bin

1 min. bin
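The 1-minute and 10-minute views are the same transfer records aggregated with different bin widths. The sketch below is one way to do this; it assumes the GridFTP log has already been parsed into (timestamp, bytes) pairs, which is a hypothetical intermediate format and not the actual log layout, and it attributes each transfer entirely to the bin containing its timestamp.

```python
from collections import defaultdict


def bin_transfer_rate(records, bin_seconds):
    """Aggregate (unix_time_s, nbytes) records into average MB/s per time bin.

    Returns a dict mapping each bin start time (seconds) to its rate in MB/s.
    """
    totals = defaultdict(int)
    for t, nbytes in records:
        bin_start = int(t // bin_seconds) * bin_seconds
        totals[bin_start] += nbytes
    return {start: total / bin_seconds / 1e6
            for start, total in sorted(totals.items())}


if __name__ == "__main__":
    # Hypothetical parsed records: (seconds since start, bytes transferred).
    records = [(0, 600_000_000), (30, 600_000_000),
               (90, 1_200_000_000), (610, 6_000_000_000)]
    print("1 min bins: ", bin_transfer_rate(records, 60))
    print("10 min bins:", bin_transfer_rate(records, 600))
```

Narrow bins expose short bursts, while wide bins smooth them into the sustained rate, which is why the two histograms of the same data can look quite different.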

Near future and Concerns

LHCONE
- Next for US and Canada
- And then, for Asia (ASGC, IHEP)

Network bandwidth
- 2015: more 10G from ICEPP to SINET? UTokyo is offering, but it depends on them.
- JFY2016: SINET will be upgraded (SINET5)
  • 100G for US (LA)
  • 20G for EU (reverse around)

EMI3
- End of full support: April 30, 2014
- End of standard updates: October 31, 2014
- End of security updates: April 30, 2015

Batch job system
- Torque/Maui: no more support, dynamic multi-core allocation is not effective
- HTCondor, SLURM, or a commercial product (UNIVA GE, LSF)