TRANSCRIPT
Computing Representatives' meeting 2016
Martin Grønlien Pejcoch
Continuous delivery
source: http://derberg.github.io/documentation-continuous-delivery/img/continuous-delivery-cycle.jpg
Post Processing Infrastructure
• our “ecGate”
• spread over 2 datarooms
• 1PB and 380TB Lustre storage
• GridEngine
  - Operational queue
  - Research queue
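As an illustration of the two-queue setup, the sketch below builds a Grid Engine `qsub` command for either queue. The queue names (`operational.q`, `research.q`), the script name and the job name are placeholders that do not come from the slides; only the `-q` and `-N` options are standard `qsub` flags. The command is built but not executed.

```python
import shlex

# Hypothetical queue names; the real names on the PPI are not given in the slides.
QUEUES = {"operational": "operational.q", "research": "research.q"}


def build_qsub_command(script, job_name, kind):
    """Return a qsub argument list targeting the operational or research queue."""
    if kind not in QUEUES:
        raise ValueError(f"unknown queue kind: {kind!r}")
    # -q selects the destination queue, -N sets the job name.
    return ["qsub", "-q", QUEUES[kind], "-N", job_name, script]


command = build_qsub_command("postprocess.sh", "pp_arome", "operational")
print(shlex.join(command))
```

Keeping the queue choice in one mapping means a job definition only states whether it is operational or research work, not the scheduler details.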
Rewrite of the ECPDS receiver 2/3
Apache Mesos (with Docker containers running in Marathon)
Cluster manager mesos.apache.org
Productstatus Production overview (DB for production metadata and an API)
github.com/metno/productstatus
Apache Kafka Distributed message broker kafka.apache.org
EVA (EVent Adapter) Listens to Kafka and triggers actions
github.com/metno/EVA
InfluxDB, Telegraf, Grafana, Kapacitor
Monitoring system influxdata.com, grafana.org
Postprocessing infrastructure similar to ecgate, built around Lustre FS and GridEngine
ELK stack (Elasticsearch, Logstash and Kibana)
Log handling https://www.elastic.co/webinars/introduction-elk-stack
Rewrite of the ECPDS receiver 3/3
Data flow
[Diagram. Components: Data processing jobs, Productstatus, Kafka message queue. Arrow labels: "Finished or incoming data processing job", "Store metadata", "Publish metadata", "Distribute published metadata to all", "Read additional metadata", "Processing completed".]
Author: Kim T. Jensen
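The data flow can be sketched roughly as follows. This is a simulation only: a standard-library `queue.Queue` stands in for the Kafka message queue and an in-memory class stands in for Productstatus, so none of the names below (`MetadataStore`, `publish`, `handle_events`, the product ID, the file path) are the real Productstatus or EVA APIs.

```python
import json
import queue


class MetadataStore:
    """In-memory stand-in for a production-metadata database (hypothetical)."""

    def __init__(self):
        self._records = {}

    def store(self, product_id, metadata):
        self._records[product_id] = metadata

    def read(self, product_id):
        return self._records[product_id]


def publish(store, broker, product_id, metadata):
    """Store the metadata, then publish a small notification message."""
    store.store(product_id, metadata)
    broker.put(json.dumps({"product_id": product_id}))


def handle_events(store, broker, action):
    """Drain the queue; for each message, read the full metadata and act on it."""
    results = []
    while True:
        try:
            message = broker.get_nowait()
        except queue.Empty:
            break
        product_id = json.loads(message)["product_id"]
        results.append(action(product_id, store.read(product_id)))
    return results


store = MetadataStore()
broker = queue.Queue()  # stands in for the message queue

# Placeholder product ID and path, used only for illustration.
publish(store, broker, "arome-2016-01-01T00",
        {"format": "netcdf", "path": "/lustre/example.nc"})

started = handle_events(
    store, broker,
    lambda pid, meta: f"postprocess {pid} ({meta['format']})",
)
print(started)
```

In this sketch the message carries only an identifier, and the listener reads the additional metadata from the store before triggering its action, mirroring the "Read additional metadata" step in the diagram.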
MetCoOp
NWP operational cooperation with Sweden
• HPC upgrade every 2 years
• Currently running 2.5km Harmonie Arome and 11km Hirlam
• Ensemble runs of the same model in test phase
  - 9 members
  - 1 AROME and ALARO control run and 8 AROME members
  - Control member of the ENS to replace today’s deterministic run
  - Members distributed on both HPCs (5,4)
• Uses ecFlow on VMs to trigger the model
• Postprocessing done separately for each institute (the PPI and ecFlow at MET Norway)
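A minimal sketch of how 9 ensemble members could end up split 5 and 4 across two HPCs. Round-robin assignment is an assumption made here for illustration; the slides give only the resulting counts. The cluster names `hpc_a` and `hpc_b` and the member labels are placeholders.

```python
from itertools import cycle


def distribute_members(members, clusters):
    """Assign ensemble members to clusters in round-robin order."""
    assignment = {cluster: [] for cluster in clusters}
    for member, cluster in zip(members, cycle(clusters)):
        assignment[cluster].append(member)
    return assignment


# 1 control run plus 8 perturbed members, 9 in total.
members = ["control"] + [f"member_{i}" for i in range(1, 9)]

# Placeholder cluster names, not the real MetCoOp HPC names.
assignment = distribute_members(members, ["hpc_a", "hpc_b"])
for cluster, assigned in assignment.items():
    print(cluster, len(assigned), assigned)
```

With an odd number of members, the first cluster in the list receives the extra one.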
Arome Arctic
• Same model and area size as for the MetCoOp (2.5km Arome)
• ecFlow on MET Norway VMs and PPI used to fetch, postprocess and create products
VGL (Access to big amounts of data selectively without having to rewrite all our SW tools)
[Diagram. Labels: Lustre; Server with GPU; VGL client, transferring already rendered context.]