Aran Bergman, Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring 2006. Principles of Reliable Distributed Systems, Recitation 5: Byzantine Synchronous Consensus, Spring 2009, Alex Shraer

Post on 20-Dec-2015



Principles of Reliable Distributed Systems

Recitation 5:

Byzantine Synchronous Consensus

Spring 2009

Alex Shraer


Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring 2008

Byzantine Synchronous Consensus

Let's vote out Marina

Let's vote out Guy

Marina

Guy


Model

• Round-based synchronous
  – Send messages to any set of processes
  – Receive messages from this round
  – Do local processing (possibly decide, halt)
• Static set P = {p1, …, pn} of processes
• t-out-of-n Byzantine (arbitrary) failures
• Authentication
• Messages between correct processes cannot be lost


Validity and Byzantine Failures

• Validity – Decision is input of one process
• Why is that a problem when Byzantine failures can occur?
  – What is the input of a Byzantine process? Why would we be ok with deciding on this input?
  – A Byzantine leader can lie about its input
• Strong unanimity - If the input of all correct processes is v then no correct process decides a value other than v


Weak Unanimity

• Weak Unanimity: If the input of all the correct processes is v and no process fails then no correct process decides a value other than v

• We will next see an algorithm for t<n with authentication and Weak Unanimity


Algorithm (for any t<n)

• The proposed algorithm is not symmetric (not all processes use the same rules)
  – One process, p1, is defined as the leader
  – Leader’s input – v1
• There is a “default” value, known a priori: vdefault ∈ {possible decision values}


Algorithm

• Process p1: SendBuffer = {v1}; all other processes: SendBuffer = { }
• In every round 1 ≤ k ≤ t+1 do
  – For every message m in SendBuffer
    • Send <m>pi to all the processes that did not sign m
  – Clear SendBuffer
  – Receive round k messages
  – For every received message m, if m has k different valid signatures beginning with p1’s (in the proof, we will call such messages legitimate)
    • Valid = Valid ∪ {v}, where v is the value received in m
    • SendBuffer = SendBuffer ∪ {m}
• If Valid contains exactly one value, decide it; else decide vdefault
• Proof of Termination – trivial
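The round structure above can be sketched as a small simulation. This is a minimal illustration and not the course's reference implementation. Signatures are modeled as a plain tuple of signer ids (no cryptography), and the names `Msg`, `legitimate`, `run` and the `equivocate` parameter are hypothetical. One interpretive assumption: "processes that did not sign m" is read as referring to m before pi adds its own signature, so the leader also receives its own round-1 message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    value: str
    signers: tuple = ()  # signer ids in signing order (stand-in for signatures)


def legitimate(m, k, leader):
    """Round-k check: k different signatures, the first one the leader's."""
    return (len(m.signers) == k
            and len(set(m.signers)) == k
            and m.signers[0] == leader)


def run(n, t, leader_input, v_default, equivocate=None, leader=1):
    """Simulate rounds 1..t+1; return the decision of every correct process.

    equivocate (hypothetical knob): optional {recipient: value}. If given,
    the leader is Byzantine: it signs and sends these values in round 1 and
    stays silent afterwards. All other processes follow the protocol.
    """
    byzantine_leader = equivocate is not None
    correct = [p for p in range(1, n + 1)
               if not (byzantine_leader and p == leader)]
    valid = {p: set() for p in correct}
    send_buffer = {p: [] for p in correct}
    if not byzantine_leader:
        send_buffer[leader].append(Msg(leader_input))

    for k in range(1, t + 2):
        inbox = {p: [] for p in correct}
        if k == 1 and byzantine_leader:
            for q, v in equivocate.items():
                if q in inbox:
                    inbox[q].append(Msg(v, (leader,)))
        # Send phase: sign every buffered message and forward it.
        for p in correct:
            for m in send_buffer[p]:
                signed = Msg(m.value, m.signers + (p,))
                for q in correct:
                    if q not in m.signers:
                        inbox[q].append(signed)
            send_buffer[p] = []
        # Receive phase: keep only legitimate round-k messages.
        for p in correct:
            for m in inbox[p]:
                if legitimate(m, k, leader):
                    valid[p].add(m.value)
                    send_buffer[p].append(m)

    return {p: (next(iter(valid[p])) if len(valid[p]) == 1 else v_default)
            for p in correct}
```

In a failure-free run every process decides the leader's input. If the leader is Byzantine and signs two different values in round 1, the correct processes exchange both values in round 2, so every Valid set holds two values and all of them decide the default.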


Proof of Weak Unanimity

• Weak Unanimity: If the input of all the correct processes is v and no process fails then no correct process decides a value other than v
• If p1 (the leader) is correct
  – All correct processes get v1 in round 1 and insert it into Valid
  – No other value is inserted into Valid
    • only messages beginning with p1’s signature are considered
    • processes cannot forge the leader’s signature
  – All correct processes decide on v1
• If p1 is not correct – Weak Unanimity requires nothing


Why don’t we get Strong Unanimity from this algorithm?

• Strong unanimity - If the input of all correct processes is v then no correct process decides a value other than v

• If p1 is Byzantine, it can send the same value to all processes, but this value can be different from the input of the correct processes


Why do we need t+1 rounds?

• Suppose that a correct process receives a value at the end of the last round and no other correct process has this value…
  – Can this happen if there are t rounds?
  – Can this happen if there are t+1 rounds?
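One way to explore the two questions is a small simulation of a specific adversary. The sketch below is an illustration under stated assumptions, not part of the original slides: the t faulty processes are taken to be p1..pt, they collude by building a signature chain privately, and they reveal it to a single correct process in round t. The function name `run_with_hidden_chain` and its parameters are hypothetical, and signatures are modeled as tuples of signer ids. It assumes 1 ≤ t ≤ n-2, so that at least two processes are correct.

```python
def run_with_hidden_chain(n, t, rounds, v, v_default):
    """Correct processes p(t+1)..pn follow the protocol for `rounds` rounds.

    Adversary (assumed strategy): p1..pt sign v in order and hand the
    resulting t-signature message to one correct process in round t.
    """
    faulty = tuple(range(1, t + 1))        # p1 is the (Byzantine) leader
    correct = list(range(t + 1, n + 1))
    target = correct[0]                    # the only process the chain reaches
    valid = {p: set() for p in correct}
    send_buffer = {p: [] for p in correct}

    for k in range(1, rounds + 1):
        inbox = {p: [] for p in correct}
        # Correct processes sign and forward what they accepted last round.
        for p in correct:
            for value, signers in send_buffer[p]:
                for q in correct:
                    if q not in signers:
                        inbox[q].append((value, signers + (p,)))
            send_buffer[p] = []
        # The adversary reveals the chain in round t, to the target only.
        if k == t:
            inbox[target].append((v, faulty))
        # Accept only legitimate round-k messages.
        for p in correct:
            for value, signers in inbox[p]:
                if (len(signers) == k and len(set(signers)) == k
                        and signers[0] == 1):
                    valid[p].add(value)
                    send_buffer[p].append((value, signers))

    return {p: (next(iter(valid[p])) if len(valid[p]) == 1 else v_default)
            for p in correct}
```

With n = 4 and t = 2, stopping after t rounds leaves p3 with Valid = {v} and p4 with an empty Valid, so they decide differently. Running the extra round lets p3 forward the message with its own signature, and both decide v.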


Agreement

• Lemma: For every two correct processes pi and pj, if v ∈ Valid_i at the end of round t+1, then v ∈ Valid_j at the end of round t+1
  – i.e., Valid sets of correct processes are the same
• Then, agreement follows
  – if the sets are empty or contain more than one value, every correct process decides vdefault
  – Otherwise all correct processes decide on the single value in Valid


Proving the Lemma

• Lemma: For every two correct processes pi and pj, if v ∈ Valid_i at the end of round t+1, then v ∈ Valid_j at the end of round t+1
  – i.e., Valid sets of correct processes are the same
• Consider a correct process pi
• Suppose that v ∈ Valid_i at the end of round t+1
  – When was v added to Valid_i? Denote this round by k
  – There are two cases: k ≤ t and k = t+1
  – We need to prove that by the end of round t+1, v ∈ Valid_j for every correct process pj
  – Note: v was a legitimate value when pi received it in round k


Proof of Lemma

• Case 1: k ≤ t
  – pi signs the message and forwards it in round k+1 ≤ t+1 (a correct process that already signed the message already has v in Valid). Since links from pi to other correct processes cannot lose messages, all correct processes receive v by round k+1 and add it to Valid if it’s not already there
• Case 2: k = t+1
  – The first t processes that signed the message must be faulty
    • Otherwise, pi would receive v in an earlier round from a correct process
  – The last process p that signed the message is correct
    • v is a legitimate message received in round t+1, thus all t+1 signatures on v are different. But there are only t faulty processes
  – p received v in round t
  – From Case 1 we know that all correct processes receive v by round t+1 and add it to Valid if it’s not already there


Q1 from HW2 – part (b)

• Prove that in the absence of failures, a broadcast algorithm that guarantees FIFO + Total Order also guarantees Causal Order

• First, the broadcast must be RELIABLE. Otherwise, the statement is not true. Counterexample:

[Figure: execution with processes p1, p2, p3 and messages m1, m2; event labels: bcast(m1), bcast(m2), deliver(m1), deliver(m1) deliver(m2)]

• FIFO is trivially preserved here, since each process bcasts only one message

• TOTAL order is trivially preserved, since only one process delivers 2 messages

• Causal order is not preserved!

• Is this execution reliable?


Q1 HW2 – part (c)

• Does this claim hold when there are failures in the system?

• The claim doesn’t hold:

[Figure: execution with processes p1, p2, p3 and messages m1, m2; event labels: bcast(m1), bcast(m2), deliver(m2)]

• p3 never delivers m1. This is allowed by Validity of reliable broadcast since p1 is faulty. It is also allowed by Agreement of reliable broadcast because p1 and p2 are faulty, and therefore p3 can deliver different messages

• Need to explain why the 3 properties of Reliable Broadcast are preserved

• Need to explain why FIFO and TOTAL order are preserved

• Need to explain why Causal order is violated


Q1 from HW2 – part (a)

• Suppose that m → m’. This means that bcast(m) → bcast(m’).

• This means that there exists a chain of bcast-deliver events

[Figure: chain of events; labels: bcast(m), bcast(m*), deliver(m*), bcast(m’’), deliver(m’’), bcast(m’)]

• Induction on the number of deliver events in the chain from m* to m’.

• If there are 0 deliver events, the claim holds by FIFO

• Assume for k, and let’s show for k+1:

• Observe the last sub-chain of the form:

[Figure: sub-chain; labels: p1: bcast(m1); p2: deliver(m1), bcast(m’)]

• Since all processes are correct, by Validity of the Reliable Broadcast, p2 delivers m’. From Integrity, this is after deliver(m1). From TOTAL, all processes deliver m1 and m’ in the same order as p2.

• m* is delivered before m1 at all processes (induction assumption) => m* is delivered before m’.

• If m ≠ m*, from FIFO m is delivered before m*


Q3 from HW2

1. Initially:
2.   TS[j] ← 0 for all 1 ≤ j ≤ n   /* array of integers */
3.   pending ← empty   /* set of messages */
4. abcast(msg):
5.   TS[i] ← TS[i] + 5
6.   bcast(⟨msg, TS[i], i⟩)
7. upon recv(⟨msg, ts, j⟩):
8.   add(pending, ⟨msg, ts, j⟩)
9.   TS[j] ← ts
10.  TS[i] ← max(TS[i], ts)
11. forever do
12.   let ⟨msg, ts, j⟩ be the entry in pending with the smallest ⟨ts, j⟩
13.   if ⟨ts, j⟩ ≤ ⟨TS[k], k⟩ for all 1 ≤ k ≤ n then
14.     remove(pending, ⟨msg, ts, j⟩)
15.     adeliver(msg)

• The algorithm above differs from the LTS algorithm learned in class in lines 5 and 10
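The pseudocode above can be transliterated into a small single-process sketch for experimenting with delivery order. This is an illustration only: the class name `Process`, the `delivered` list and the choice to run the delivery loop after every `recv` are assumptions, and the underlying `bcast` transport is left to the caller, who passes the tuple returned by `abcast` to the `recv` method of every process (including the sender).

```python
class Process:
    """One process p_i running the modified timestamp algorithm (a sketch)."""

    def __init__(self, i, n):
        self.i = i
        self.TS = {j: 0 for j in range(1, n + 1)}  # lines 1-2
        self.pending = []                          # line 3
        self.delivered = []                        # record of adeliver calls

    def abcast(self, msg):
        self.TS[self.i] += 5                       # line 5
        return (msg, self.TS[self.i], self.i)      # line 6: caller broadcasts

    def recv(self, entry):
        msg, ts, j = entry                         # line 7
        self.pending.append(entry)                 # line 8
        self.TS[j] = ts                            # line 9
        self.TS[self.i] = max(self.TS[self.i], ts)  # line 10
        self._deliver_ready()

    def _deliver_ready(self):
        # Lines 11-15, run until the smallest pending entry is not yet stable.
        while self.pending:
            entry = min(self.pending, key=lambda e: (e[1], e[2]))
            msg, ts, j = entry
            # Lexicographic comparison of <ts, j> against <TS[k], k> for all k.
            if not all((ts, j) <= (self.TS[k], k) for k in self.TS):
                break
            self.pending.remove(entry)
            self.delivered.append(msg)
```

Feeding the same sequence of `recv` calls to several `Process` objects makes it easy to compare their `delivered` lists against a hand-drawn timeline.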


Q3 from HW2 – part (a)

[Figure: timeline of p1, p2, p3 with messages m1 and m2; timestamp labels <5,1>, <5,2>, <5,3>, <10,2>, <15,2>, <15,3>; caption: Delivery according to the new algorithm]


Q3 from HW2 – part (c)

[Figure: timeline of p1, p2, p3 with clock values 0, 1, 3 and timestamp labels <1,2>, <1,3>]

The original LTS would deliver m1 at time 7. When would it deliver m2?