Distributed Systems · homes.sice.indiana.edu/prateeks/ds/Lec1.pdf · 2019-01-09

Distributed Systems CSCI-B 534/ENGR E-510 Spring 2019 Instructor: Prateek Sharma




Welcome!
• What is a distributed system?
• Where are distributed systems found?
• Why should you take this course?
• Small taste of challenges in distributed systems
• Course contents, outline, structure, etc.


What Is A Distributed System?
• Collection of autonomous computing elements that appears to its users as a single coherent system
  • Computing elements: hardware devices or software processes
  • Single coherent system: Users and applications perceive a single system

[Figure: nodes / servers / processes joined by communication links]


Where Distributed Systems Fit In

[Figure: layers arranged by abstraction level. Labels as listed on the slide: Computer Architecture; Operating Systems; Computer Networks; Distributed Systems; Web Services (e-commerce); “Big Data” Processing; Cloud Storage; Machine Learning; Internet of Things.]


Distributed Systems Are Everywhere
• Large-scale Internet Services
  • Web clusters for high-traffic websites
• Cloud storage
  • Dropbox, Google Drive, ...
• Large-scale data processing
  • Map-reduce to process TB’s of data
• Graph processing
  • Social network analysis
• Large-scale machine learning
  • Model training and inference
• Sensor Networks
  • Internet of Things
• Modern multi-core architectures


Large-Scale Distributed Systems

• Conventionally: application deployed on a single server
• Warehouse-scale computing: meet increasing computing needs of applications
• How to handle computing, storage, and networking needs of millions of users?


Challenges In Distributed Systems
• Nodes are independent and can be uncoupled or only loosely coupled
• No global clock!
  • How to order events within a distributed system?
• How to organize a distributed system?
  • Node membership: fixed, dynamic
• Communication between nodes
  • Structured: Each node has well-defined set of neighbors (tree, ring)
  • Unstructured: Nodes can communicate with any other node
• Fault-tolerance: Nodes can fail in multiple ways
• Data consistency
• Performance
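
To make the "no global clock" bullet concrete, here is a minimal sketch in Python. The skew values and the `SkewedClock` and `LamportClock` classes are assumptions made up for illustration, not something shown in the lecture. The sketch shows a receive event stamped earlier than its send when two nodes use skewed wall clocks, and one possible fix: a simple logical counter that keeps send before receive.

```python
# Illustrative sketch: class names and skew values are made up for this example.

class SkewedClock:
    """A wall clock that is off from 'true' time by a fixed skew (seconds)."""
    def __init__(self, skew):
        self.skew = skew

    def now(self, true_time):
        return true_time + self.skew


class LamportClock:
    """Logical clock: a counter that only moves forward."""
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event or send: advance the counter.
        self.time += 1
        return self.time

    def receive(self, sender_time):
        # On receive: jump past the sender's timestamp.
        self.time = max(self.time, sender_time) + 1
        return self.time


# Node A's clock runs 5 s fast; node B's clock runs 5 s slow.
clock_a, clock_b = SkewedClock(+5.0), SkewedClock(-5.0)

# A sends at true time 100.0; B receives 1 s later, at true time 101.0.
send_stamp = clock_a.now(100.0)
recv_stamp = clock_b.now(101.0)
physical_order_ok = send_stamp < recv_stamp   # the receive appears to come first

# The same exchange with logical clocks.
lc_a, lc_b = LamportClock(), LamportClock()
logical_send = lc_a.tick()
logical_recv = lc_b.receive(logical_send)
logical_order_ok = logical_send < logical_recv

print(physical_order_ok, logical_order_ok)
```

With these assumed skews the wall-clock timestamps put the receive before the send, while the logical timestamps keep the causal order.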


Designing Distributed Software Systems
• Scalability: Handle massive changes in user requests (10X)
• Performance
• Reliability and Resiliency: Handle partial failures gracefully
• Usability: Abstractions and Interfaces
• Jeff Dean: Building Software Systems At Google and Lessons Learned
  https://www.youtube.com/watch?v=modXC5IWTJI


Coherent Systems
• Collection of nodes as a whole appears the same to the end-user
• Single coherent system: Users/applications perceive a single system
• Users cannot determine location of computation/data
  • Or details about the data replication

[Figure: nodes]


Some Challenges In Distributed Systems
1. Distributed reads and writes
2. How to build distributed systems --- Middleware
3. Two Generals Problem


Conventional Program Semantics

    def foo():
        X = 0
        X = 1
        print(X)

• We expect X==1, because that was the last write
• Writes take effect “in order” of their issue
  • Aka “Strong consistency”


Distributed Reads and Writes
• How to ensure that data written can be retrieved from any server?
  • Replication!
  • Broadcast values after a write
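
The "broadcast values after a write" idea can be sketched as follows. This is a minimal in-process model, not a real networked system: `Server` and `ReplicatedStore` are hypothetical names, and the sketch assumes no failures and a single writer.

```python
# Illustrative sketch: Server and ReplicatedStore are hypothetical names.
# Assumes no failures, no message loss, and one writer at a time.

class Server:
    """One replica holding its own copy of the data."""
    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class ReplicatedStore:
    """Broadcasts every write to all replicas before returning."""
    def __init__(self, servers):
        self.servers = servers

    def write(self, key, value):
        # Synchronous broadcast: the write completes only after
        # every replica has applied it.
        for server in self.servers:
            server.apply(key, value)

    def read_from(self, index, key):
        # A client may contact any replica.
        return self.servers[index].read(key)


servers = [Server(f"s{i}") for i in range(3)]
store = ReplicatedStore(servers)
store.write("X", 0)
store.write("X", 1)

values = [store.read_from(i, "X") for i in range(len(servers))]
print(values)  # [1, 1, 1]
```

Every replica returns the last write because `write` does not return until all copies are updated. That wait is where the latency cost of this approach comes from.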


Tradeoffs In Distributed Systems
• Previous consistency example:
  • Sacrifice latency for “strong consistency”
  • Or, have a “looser” consistency for lower latency
• Another classic tradeoff is Cost vs. Performance:
  • Low Cost, High Performance (Doesn’t exist)
  • High Cost, High Performance
  • Low Cost, Low Performance
  • High Cost, Low Performance (Undesirable)
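
The latency side of the consistency tradeoff can be illustrated with a little arithmetic. The replica names and latencies below are assumed values. The model treats each number as the time until that replica has applied and acknowledged a write issued at t=0, ignoring the difference between one-way and round-trip delay.

```python
# Illustrative sketch: replica names and latencies (ms) are assumed values.

replica_latency_ms = {"local": 1, "same_region": 10, "other_continent": 150}

def strong_write_latency(latencies):
    # The writer waits for every replica to acknowledge (sent in parallel),
    # so the slowest replica determines the write latency.
    return max(latencies.values())

def loose_write_latency(latencies):
    # The writer returns once the nearest replica has the value;
    # the other replicas are updated in the background.
    return min(latencies.values())

def stale_replicas(latencies, read_time_ms):
    # Replicas that have not yet applied a write issued at t=0
    # when a read arrives at read_time_ms.
    return sorted(name for name, ms in latencies.items() if ms > read_time_ms)

strong = strong_write_latency(replica_latency_ms)
loose = loose_write_latency(replica_latency_ms)
stale = stale_replicas(replica_latency_ms, 20)
print(strong, loose, stale)
```

Under these assumptions the strongly consistent write takes as long as the farthest replica, while the looser write returns almost immediately but leaves a window in which a read at the distant replica sees the old value.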


Middleware: The OS of Distributed Systems
• Commonly used components and functions for distributed applications


Middleware Goals
• Resource Sharing
• Distribution Transparency
• Openness (Other nodes can join)
• Scalability


Two Generals Problem
• Two Roman Generals want to co-ordinate an attack on the enemy
  • Both must attack simultaneously. Otherwise, both will lose
• Only way to communicate is via a messenger
  • But messengers can get captured/lost.
  • Perfectly-reliable communication system not available

Task: Design a protocol that ensures the two generals always attack simultaneously
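
One way to get a feel for the task is to try a concrete design and look for a loss pattern that breaks it. The sketch below is one hypothetical attempt, not a protocol from the lecture: the generals exchange a fixed-length chain of acknowledgements, and each attacks only if every message it expected has arrived. The function names and the decision rule are assumptions for illustration.

```python
# Illustrative sketch: the acknowledgement-chain protocol and the decision
# rule below are one hypothetical design, not a protocol from the lecture.

def run_protocol(num_messages, first_lost):
    """Simulate an alternating chain of messages between generals A and B.

    Message 1 goes A -> B, message 2 goes B -> A, and so on. Each message is
    sent only if the previous one arrived. `first_lost` is the index of the
    first message the messenger fails to deliver (None means no loss).
    Returns (a_attacks, b_attacks).
    """
    received = {"A": 0, "B": 0}   # messages each general has received
    expected = {"A": num_messages // 2, "B": (num_messages + 1) // 2}

    for i in range(1, num_messages + 1):
        if first_lost is not None and i >= first_lost:
            break  # message i is lost, so the chain stops here
        receiver = "B" if i % 2 == 1 else "A"
        received[receiver] += 1

    # Decision rule: attack only if every expected message arrived.
    a_attacks = received["A"] == expected["A"]
    b_attacks = received["B"] == expected["B"]
    return a_attacks, b_attacks


def find_disagreement(num_messages):
    """Return a loss pattern under which exactly one general attacks."""
    for first_lost in [None] + list(range(1, num_messages + 1)):
        a_attacks, b_attacks = run_protocol(num_messages, first_lost)
        if a_attacks != b_attacks:
            return first_lost
    return None


for k in range(1, 6):
    print(k, find_disagreement(k))
```

For every chain length tried, losing only the final message leaves its sender attacking alone. Adding one more acknowledgement just moves the problem to the new last message.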


Impossibility Proof of Two Generals Problem
• Claim: There is no non-trivial protocol that guarantees that the two generals will always attack simultaneously
• Proof by induction on the number of messages
  • Let d be the number of messages delivered by the time of the attack
  • Base case: d=0. Claim holds (coordination is impossible without any delivered messages)
  • Inductive step: suppose the impossibility claim holds for d=n. Assume, for contradiction, that some protocol works with d=n+1
  • Consider message n+1
    • Its sender must attack without knowing whether the message was delivered
    • The receiver must then attack too, even if the message was not received
    • So the last message (n+1) was irrelevant, and n messages suffice
    • That contradicts the hypothesis that no protocol works with n messages (n+1 was supposed to be the smallest number of messages that suffices)
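
The key step of the proof is that the sender of the last message cannot tell whether it arrived. The sketch below checks that step mechanically under a simple model. The `view` function and the alternating message numbering are assumptions for this example: a general's view is the list of messages it sent plus the list it received, and a deterministic general can only decide from its view.

```python
# Illustrative sketch: the view function and message numbering are
# assumptions made for this example.

def sender_of(i):
    # Messages alternate: odd-numbered from A, even-numbered from B.
    return "A" if i % 2 == 1 else "B"

def view(general, num_sent, delivered):
    """What `general` can observe: messages it sent and messages it received.

    `delivered` is the set of message indices the messenger delivered.
    """
    sent = [i for i in range(1, num_sent + 1) if sender_of(i) == general]
    got = [i for i in range(1, num_sent + 1)
           if sender_of(i) != general and i in delivered]
    return (tuple(sent), tuple(got))

def last_message_is_invisible_to_sender(num_messages):
    # Compare the run where everything arrives with the run where
    # only the last message is lost, from the last sender's side.
    all_delivered = set(range(1, num_messages + 1))
    last_lost = all_delivered - {num_messages}
    s = sender_of(num_messages)
    return view(s, num_messages, all_delivered) == view(s, num_messages, last_lost)

results = [last_message_is_invisible_to_sender(k) for k in range(1, 8)]
print(all(results))
```

Because the two views are identical, the sender must make the same decision in both runs, which is why the last message cannot matter to it. The receiver's view does differ between the two runs, and that is what forces it to attack "even if the message was not received".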


Common Knowledge
• Solving the Two Generals Problem requires common knowledge
• Common knowledge cannot be achieved with unreliable communication channels


What Is This Course About?


Course Outline
Major theme: Combine fundamentals of distributed computing with real-world systems and applications
1. Architecture of Distributed Systems
2. Components of distributed systems: processes, threads, and virtualization
3. Communication in distributed systems and RPC’s
4. Ordering of events, logical clocks [theory]
5. Vector clocks and their applications
6. Distributed data processing (Map-Reduce)
7. Mutual exclusion
8. Leader election
9. Global state and snapshots


Course Outline Continued
10. Data consistency and CRDT’s
11. Distributed Caching
12. Key-value stores (Riak, Voldemort, Redis)
13. Storage: NFS
14. Storage: CAP theorem
15. Consensus: Raft, Paxos
16. Fault-tolerance: Spark
17. Distributed ML
18. Cluster Scheduling
19. Block Chains


Course Prerequisites
• Distributed Systems is fundamentally an advanced topic
  • Even “easy” problems in conventional non-distributed computing are hard or even impossible in distributed settings
  • We will focus both on rigorous distributed algorithms and engineering large distributed systems
• “You can have a second computer if you can show you know how to use the first one” ---Paul Barham
• You must be proficient in:
  • Systems and network programming: processes, threads, sockets, file-handling, basic UNIX IPC (pipes, etc.)
  • Algorithmic thinking and discrete math proof techniques (such as induction)


Logistics


Course Web-page
http://homes.sice.indiana.edu/prateeks/dist-sys-course.html


Course Structure and Text Book
• Lectures: Tuesday and Thursday --- 1 pm to 2:15 pm
  • Questions encouraged!
  • Points for class participation
• Readings assigned for each lecture
  • Read before coming to class!
• Text-book: “Distributed Systems”. Maarten van Steen and Andrew Tanenbaum
  • Soft-copy available on the web
• Most lectures will also discuss research papers
• Reference book for distributed algorithms:
  • “Elements of Distributed Computing”. Vijay Garg


Evaluation Components
• Programming Assignments [20%]
  • 2-4 spread throughout the semester
  • Usually language agnostic (choose any popular programming language)
  • Poll: Python, C++, Java, Go, Rust
• Homework [10%]
• Exams [30%]
  • Mid-term(s)
• Final Project [30%]
  • Build a distributed system. End-goal is to have a conference-style paper
  • Groups of 3
• Class Participation: Quizzes etc. [10%]


Resources
• http://homes.sice.indiana.edu/prateeks/dist-sys-course.html
  • Check course webpage frequently!
• Office hours:
  • Prateek: Thu 3-4 in Luddy Hall 4126
  • Vibhatha: TBA
• Canvas:
  • Discussions
  • Submitting assignments


Next Time
• Distributed systems building blocks
  • Refresher on Operating System Processes and Threads