Ensuring the Availability of IT Services - Radioklub Pazin
Petar Koraca
whoami
• VeleRi – computer science, software engineering track
• sysadmin @ Infobip, Pula
• Linux/Windows, High Availability, monitoring, virtualization, ...
Topic?
• Internet
• Web
• Redundancy
– examples
• Virtualization
Internet
• "The network of all networks"
• A global system of interconnected networks
• Separate networks unified by IP addressing and BGP routing
• TCP/IP protocol suite
History
• 1966 – 1990 - ARPANET (UCLA, Stanford, UCSB, Utah)
• 1982 - TCP/IP established as the standard
• 1985 – 1995 - NSFNET
• 1991 - Sir Tim Berners-Lee (CERN)
– HTTP protocol, Web server, Web browser, HTML
– first web page: info.cern.ch
TCP/IP protocol suite
• Upper layers are "closer" to the application, lower layers to the physical transfer of data
• 7 layers in the OSI model (the TCP/IP model itself is usually drawn with 4)
Internet routing
• Tier 1 network – AT&T, Level3, Deutsche Telekom, Cogent, Sprint, Tata, ...
– Do not buy transit; they own transcontinental fiber
• Tier 2 network – Vodafone, British Telecom, Interoute, ...
• Tier 3 network
• BGP – Border Gateway Protocol
• IX – Internet Exchange
Traceroute
Tracing route to www.g.ebay.com [66.135.200.181]
1 * * * Request timed out.
2 7 ms 6 ms 7 ms 89.164.98.3
3 8 ms 10 ms 8 ms 82.193.201.33
4 7 ms 10 ms 9 ms 213.147.120.69
7 7 ms 7 ms * 213.147.96.110
8 7 ms 9 ms 8 ms 212.162.29.1
9 38 ms 48 ms 46 ms ae-15-15.ebr2.Berlin1.Level3.net [4.69.151.190]
10 42 ms 49 ms 48 ms 4.69.200.169
11 40 ms 39 ms 48 ms ae-21-21.ebr1.Dusseldorf1.Level3.net [4.69.143.181]
12 39 ms 39 ms 40 ms ae-46-46.ebr3.Frankfurt1.Level3.net [4.69.143.170]
13 39 ms 45 ms 48 ms ae-63-63.csw1.Frankfurt1.Level3.net [4.69.163.2]
14 39 ms 51 ms 39 ms ae-1-60.edge3.Frankfurt1.Level3.net [4.69.154.7]
15 39 ms 39 ms 39 ms sprint-level3-ge.Frankfurt1.Level3.net [4.68.111.142]
16 56 ms 56 ms 57 ms sl-bb21-par-4-0-0.sprintlink.net [213.206.129.149]
17 125 ms 127 ms 125 ms sl-crs1-par-0-8-0-0.sprintlink.net [217.118.224.55]
18 126 ms 125 ms 126 ms sl-crs2-nyc-0-2-2-0.sprintlink.net [144.232.20.45]
19 154 ms 138 ms 137 ms sl-crs2-rly-0-9-0-0.sprintlink.net [144.232.8.244]
20 * 138 ms 157 ms sl-crs2-dc-0-4-0-2.sprintlink.net [144.232.8.164]
22 169 ms * 168 ms 144.232.5.210
23 169 ms 168 ms 168 ms sl-crs4-atl-0-3-0-0.sprintlink.net [144.232.5.115]
25 172 ms 187 ms 172 ms 144.232.11.178
26 * * * Request timed out.
IP addresses
• IPv4 - ~4.3 billion IP addresses (2^32 ≈ 4.3 × 10^9)
• Public – e.g. 144.232.20.45
• Private
– 10.x.x.x
– 192.168.x.x
– 172.16.0.0 - 172.31.255.255
• IPv6
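The public/private split above can be checked mechanically. A minimal sketch using Python's standard `ipaddress` module; the helper name `is_private` and the `PRIVATE_RANGES` list are illustrative, and the three ranges are simply the ones from the slide written in CIDR notation.

```python
import ipaddress

# The three private IPv4 ranges from the slide, in CIDR notation.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.x.x.x
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 - 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.x.x
]

# IPv4 addresses are 32 bits long, which gives the "~4.3 billion" figure.
TOTAL_IPV4 = 2 ** 32


def is_private(address: str) -> bool:
    """Return True if the IPv4 address falls in one of the private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


if __name__ == "__main__":
    for candidate in ["144.232.20.45", "10.1.2.3", "172.31.255.255", "172.32.0.1"]:
        print(candidate, "private" if is_private(candidate) else "public")
    print("IPv4 address space:", TOTAL_IPV4)
```

A server with only a private address is not reachable from the internet directly, which is where NAT comes in.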
Client-Server model
• A network architecture
• The client (user) sends requests
• The server answers them
• Web browsing, games, video streaming, ...
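The request/response cycle can be tried on a single machine. The sketch below is only an illustration under simple assumptions: the handler `HelloHandler` and the helper `fetch_once` are made-up names, the server runs in a background thread on the loopback interface, and the OS picks a free port.

```python
import http.client
import http.server
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Server side: answers every GET request with a short text body."""

    def do_GET(self):
        body = f"You asked for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path: str = "/hello") -> str:
    """Start a server, send one client request to it, return the response body."""
    server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: open a connection, send the request, read the answer.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", path)
        body = connection.getresponse().read().decode("utf-8")
        connection.close()
        return body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once("/hello"))
```

Everything here lives in one process, so it has a single point of failure by construction.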
Server - characteristics
• OS – Linux / Windows / Solaris / BSD / ...
• Static IP
• Public IP or NAT – Network Address Translation
• Dual power supply
• UPS – Uninterruptible Power Supply
• Multiple NICs (teaming)
• RAID – redundant disks
• Out-of-band management port
– BIOS
– Power on/off
– Monitoring
– "Proprietary" implementations - $$$
• Datacenter – power, cooling, disaster recovery, ISO and PCI certifications, ...
Server
Domain name
• IP address: 208.117.229.181
• Domain name: www.google.com
• DNS – Domain Name System
Web 1.0
• Static content, very little interaction
• Hyperlinks as the connection between pages
• The user can mostly just browse content
• Little traffic
• Focus on presenting content
Web 2.0
• User generated content
• A collaborative medium
• Social networking, blog, wiki, video, music, ...
• Facebook, Twitter, Google Services, Netflix, Instagram, ...
• Web API – REST, SOAP, XML and JSON documents
Number of internet users over the years
May, 2012
Web 2.0 numbers
• Twitter – 500M tweets per day (~6000/sec)
• Netflix – 1/3 of total US bandwidth
• Salesforce – 1B transactions per day
• Facebook - ~500+ M daily active users
• Tumblr – 500M page views per day, 40k/sec user requests
• Banks, stock exchanges, ...
Client - Server v1
• 1 web server hosting both the app and the database
Client - Server v2
• 1 web server – 1 db server
Further requirements
• Uptime
• SLA – Service Level Agreement
• 99.999% - five nines – 5.26 min/year
• 99.99% - four nines – 52.56 min/year
• Backup alone is not enough
• 2 web servers + 2 databases
• SQL – relational databases – Master / Slave
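The downtime figures for five and four nines follow from simple arithmetic. A small sketch, assuming a 365-day year (which reproduces the numbers on the slide); the function name `allowed_downtime_minutes` is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be down and still meet the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for sla in (99.9, 99.99, 99.999):
        print(f"{sla}% -> {allowed_downtime_minutes(sla):.2f} min/year")
```

Each additional nine cuts the downtime budget by a factor of ten, so five nines leaves too little time for a manual restore from backup.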
Redundancy
• Application / service
• Database
• Datacenter
• Power
• Cooling
• ISP – Internet Service Provider
• Network equipment (router, firewall, switch, cables)
• Servers
• Disks (RAID)
• Storage
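Why duplicate every item on this list? A rough estimate, not taken from the slides: if each copy of a component is available a fraction `a` of the time and copies fail independently, then `n` redundant copies are all down at once with probability `(1 - a)^n`. Components that are all required (for example the web tier and the database tier) multiply their availabilities. The function names below are illustrative, and the independence assumption is optimistic in practice, since shared power, shared software bugs and failover errors all break it.

```python
from math import prod
from typing import Iterable


def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` interchangeable components, assuming independent failures."""
    return 1 - (1 - single) ** copies


def chained_availability(parts: Iterable[float]) -> float:
    """Availability of a chain in which every part is required."""
    return prod(parts)


if __name__ == "__main__":
    one_server = 0.99  # assumed figure, for illustration only
    pair = redundant_availability(one_server, 2)
    print(f"single server: {one_server:.4%}")
    print(f"redundant pair: {pair:.4%}")
    print(f"web pair + db pair: {chained_availability([pair, pair]):.4%}")
```

Under these assumptions two 99% servers behave like one 99.99% server, but the whole chain is still only as good as its weakest non-redundant link.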
Hardware redundancy
Cluster sustavi, CCERT-PUBDOC-2006-12-176, CARNet
Service redundancy
• Load balancing
– LVS
– HAProxy
• High Availability
– Heartbeat, Pulse, Keepalived
– DRBD – Distributed Replicated Block Device
• Clustering
– Microsoft Failover Cluster
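The core idea of load balancing with failover fits in a few lines. This is a toy sketch, not how LVS or HAProxy are implemented: the class `RoundRobinBalancer` and the backend names are made up, and health status is set by hand rather than by real health checks.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Hands out backends in turn, skipping any that are marked down."""

    def __init__(self, backends: Iterable[str]):
        self.backends: List[str] = list(backends)
        self.healthy: Set[str] = set(self.backends)
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend: str) -> None:
        self.healthy.discard(backend)

    def mark_up(self, backend: str) -> None:
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self) -> str:
        # Try each backend at most once per call, starting where we left off.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web1", "web2"])
    print([balancer.pick() for _ in range(4)])
    balancer.mark_down("web1")  # simulate a failed health check
    print([balancer.pick() for _ in range(2)])
```

With one backend down, all traffic lands on the survivor, which is why each server needs enough spare capacity to carry the full load alone.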
LVS - Piranha
• LVS – Linux Virtual Server
• Layer 4 load balancer
• Piranha – Red Hat implementation (ipvsadm + pulse + lvsd) – discontinued project
• Free & Open Source
• Direct Routing
• Connection synchronization
• Active/standby
• Virtual IP – bound to a Virtual Service
• Real server – back-end servers
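The active/standby pattern with a virtual IP can be illustrated with a small simulation. This is a hypothetical model, not the behavior of pulse, Heartbeat or Keepalived: the class `StandbyNode`, the 3-second threshold and the automatic failback are assumptions made for the example, and time is passed in as plain numbers so the logic can be followed without real networking.

```python
from typing import Optional


class StandbyNode:
    """Standby balancer that claims the virtual IP when the active node goes silent."""

    def __init__(self, virtual_ip: str, dead_after_seconds: float = 3.0):
        self.virtual_ip = virtual_ip
        self.dead_after_seconds = dead_after_seconds
        self.last_heartbeat: Optional[float] = None
        self.owns_virtual_ip = False

    def on_heartbeat(self, now: float) -> None:
        """Called whenever a heartbeat from the active node arrives."""
        self.last_heartbeat = now
        # Assumed policy: hand the virtual IP back once the active node returns.
        self.owns_virtual_ip = False

    def tick(self, now: float) -> bool:
        """Periodic check; returns True while this node holds the virtual IP."""
        if self.last_heartbeat is None:
            self.last_heartbeat = now  # start the timer on the first check
        elif now - self.last_heartbeat > self.dead_after_seconds:
            self.owns_virtual_ip = True  # active node presumed dead: take over
        return self.owns_virtual_ip


if __name__ == "__main__":
    standby = StandbyNode("192.0.2.10")  # documentation-range address, illustrative
    standby.on_heartbeat(0.0)
    for t in (1.0, 2.0, 3.5, 5.0):
        print(t, "standby holds VIP" if standby.tick(t) else "active holds VIP")
```

Clients only ever talk to the virtual IP, so the takeover is invisible to them apart from a short gap while the timeout expires.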
LVS Load Balancing
HAProxy
• Free & Open Source
• GitHub, Reddit, Farmville, RedHat, StackOverflow, Tumblr, Twitter, Virgin Airlines, ...
• Layer 7 load balancer – TCP/HTTP
• Requires more resources than an L4 balancer
• Features:
– URL rewriting
– ACL – Access Control List (front-end, back-end)
– Host header redirecting
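What separates a Layer 7 balancer from a Layer 4 one is that it can inspect the HTTP request before choosing a backend. The sketch below mimics ACL-style rules on the Host header and URL path in plain Python; it is not HAProxy configuration syntax, and the rule set, pool names and hostnames are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """One ACL-like rule: all conditions that are set must match."""
    backend: str
    host: Optional[str] = None
    path_prefix: Optional[str] = None

    def matches(self, host: str, path: str) -> bool:
        if self.host is not None and host != self.host:
            return False
        if self.path_prefix is not None and not path.startswith(self.path_prefix):
            return False
        return True


def choose_backend(host_header: str, path: str, rules: List[Rule], default_backend: str) -> str:
    """Return the backend pool of the first matching rule, else the default."""
    host = host_header.split(":")[0].strip().lower()  # drop an optional port
    for rule in rules:
        if rule.matches(host, path):
            return rule.backend
    return default_backend


RULES = [
    Rule(backend="api_pool", host="api.example.com"),
    Rule(backend="static_pool", path_prefix="/static/"),
]

if __name__ == "__main__":
    print(choose_backend("api.example.com", "/v1/users", RULES, "web_pool"))
    print(choose_backend("www.example.com", "/static/logo.png", RULES, "web_pool"))
    print(choose_backend("www.example.com", "/", RULES, "web_pool"))
```

Parsing every request like this is the extra work that makes an L7 balancer heavier than an L4 one, which only looks at addresses and ports.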
Virtualization
• Hardware abstraction
• Using one computer's resources for several isolated environments
• Real OS, virtual OS
• Physical machine - host
• Virtual machine – guest
Virtualization
Benefits of virtualization
• Better resource utilization
• Cloning
• Snapshots
• (Live) Migration
• Testiranje
• ...
Hypervisors
• Oracle VirtualBox
• VMware
• Microsoft Hyper-V
• KVM
• Xen
• ...
Example of an HA+LB+virt system
Georedundancy
• Reasons:
– Client requirements
– Latency
• Anycast – announcing the same IP address range from multiple locations
• Distributed systems
– e.g. databases
Cloud
• IaaS – Infrastructure as a Service
– AWS – Amazon Web Services
– Rackspace
• PaaS – Platform as a Service
– Heroku
• SaaS – Software as a Service
– Gmail
Outages
• 2012 – leap second Linux bug
– NTP – Network Time Protocol
• LinkedIn, Reddit, Mozilla, Netflix, ...
• Skype – 24h downtime
• ...
Cost of outage: $90,000 per hour in the media sector to about $6.48 million per hour for large online brokerages
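The hourly figures turn into concrete amounts once they are multiplied by an outage duration. A back-of-the-envelope sketch: the function name `outage_cost` is illustrative, the two rates are the ones quoted above, and the example durations are assumptions.

```python
MEDIA_COST_PER_HOUR = 90_000         # USD per hour, media sector (figure quoted above)
BROKERAGE_COST_PER_HOUR = 6_480_000  # USD per hour, large online brokerage (figure quoted above)


def outage_cost(downtime_minutes: float, cost_per_hour: float) -> float:
    """Estimated cost of an outage, assuming cost grows linearly with duration."""
    return downtime_minutes / 60 * cost_per_hour


if __name__ == "__main__":
    # Example durations: a 30-minute incident and a full 24-hour outage.
    for minutes in (30, 24 * 60):
        print(f"{minutes} min, media sector: ${outage_cost(minutes, MEDIA_COST_PER_HOUR):,.0f}")
        print(f"{minutes} min, brokerage:    ${outage_cost(minutes, BROKERAGE_COST_PER_HOUR):,.0f}")
```

The linear model is a simplification, since real losses also depend on time of day and reputational damage, but it gives a number to set against the price of redundant hardware.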
The end