brewer’s conjecture and the feasibility of cap web services

BREWER’S CONJECTURE AND THE FEASIBILITY OF CAP WEB SERVICES (Eric Brewer) Seth Gilbert Nancy Lynch Presented by Kfir Lev- Ari


Post on 24-Feb-2016


Page 1: Brewer’s Conjecture and the feasibility of CAP web services

BREWER’S CONJECTURE AND

THE FEASIBILITY OF CAP WEB SERVICES

(Eric Brewer) Seth Gilbert, Nancy Lynch. Presented by Kfir Lev-Ari

Page 2: Brewer’s Conjecture and the feasibility of CAP web services

INTRODUCTION

Brewer’s Conjecture (at PODC 2000): it is impossible for a web service to provide all three of the following guarantees at the same time:
o Consistency
o Availability
o Partition-tolerance

Page 3: Brewer’s Conjecture and the feasibility of CAP web services

STORY TIME The story of 2 servers 1 CAP

Page 4: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (1)

In order to make some [image missing] you decided to start your own company in the “cloud”.

And you have a brilliant idea! You’ll create a web service named: [image missing]

Page 5: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (2)

You’ll give your users the following API:
• SetValue(Key, Value)
• GetValue(Key)

And you’ll promise them two basic things:
1. To be available 24/7.
2. GetValue will return the last value that was set for a given key.
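As a starting point, the promised API can be sketched on a single server, where both promises are easy to keep as long as that one machine stays up. This is only a minimal illustration: the class name `KeyValueService` and the Python-style method names are assumptions, not something the slides define.

```python
class KeyValueService:
    """Single-server sketch of the SetValue/GetValue API (names are illustrative)."""

    def __init__(self):
        self._store = {}  # the one and only copy of the data

    def set_value(self, key, value):
        # With a single copy, a write is visible as soon as it is stored.
        self._store[key] = value

    def get_value(self, key):
        # Returns the last value set for the key, or None if it was never set.
        return self._store.get(key)
```

The difficulty the rest of the deck explores only appears once the data is replicated across more than one server.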

Page 6: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (3)

Page 7: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (4)

Page 8: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (4)

Page 9: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (5)

?

Send email

Page 10: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (6)

Page 11: Brewer’s Conjecture and the feasibility of CAP web services

MOTIVATION (7)

Page 12: Brewer’s Conjecture and the feasibility of CAP web services

FORMAL MODEL AND THEOREM


Page 13: Brewer’s Conjecture and the feasibility of CAP web services

FORMAL MODEL (1)

• Atomic / Linearizable Consistency (of a web service):
o There must exist a total order on all operations such that each operation looks as if it were completed at a single instant.
o I.e., each server returns the right response to each request.
o Equivalent to having a single up-to-date copy of the data.
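To make the "total order" requirement concrete, here is a small brute-force check for a single read/write register. It is a sketch under stated assumptions: the `Op` record, its field names, and the function `is_linearizable` are hypothetical, and trying every permutation is only practical for tiny histories.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    kind: str     # "write" or "read"
    value: int    # value written, or value the read returned
    start: float  # time the request was issued
    end: float    # time the response was received


def is_linearizable(history, initial=0):
    """Return True if some total order of the operations explains the history."""
    n = len(history)
    for order in permutations(range(n)):
        position = {op_index: rank for rank, op_index in enumerate(order)}
        # Real-time constraint: if a finished before b started, a comes first.
        respects_real_time = all(
            position[a] < position[b]
            for a in range(n)
            for b in range(n)
            if history[a].end < history[b].start
        )
        if not respects_real_time:
            continue
        # Register semantics: every read returns the latest preceding write.
        current = initial
        legal = True
        for i in order:
            op = history[i]
            if op.kind == "write":
                current = op.value
            elif op.value != current:
                legal = False
                break
        if legal:
            return True
    return False
```

A read that starts after a write has completed but still returns the old value makes the check fail; if the two operations overlap in time, the same read is acceptable because it can be ordered before the write.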

Page 14: Brewer’s Conjecture and the feasibility of CAP web services

FORMAL MODEL (2)

• Availability (of a web service):
o Every request received by a non-failing node in the system must result in a response.
o In other words, any algorithm used by the service must eventually terminate.
o Note that there is no bound on how long the algorithm may run before terminating, and therefore the theorem allows unbounded computation.
o On the other hand, even when severe network failures occur, every request must terminate.

Page 15: Brewer’s Conjecture and the feasibility of CAP web services

FORMAL MODEL (3)

• Partition Tolerance (of a web service):
o When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost.
o Note that unlike the previous two requirements, partition tolerance is really a statement about the underlying system rather than the service itself: it is the communication among the servers that is unreliable.

Page 16: Brewer’s Conjecture and the feasibility of CAP web services

NOTE – THIS CAP ISN’T MADE OF ACID

• The ACID (Atomicity, Consistency, Isolation, Durability) properties focus on consistency and are the traditional approach of databases. The CAP properties describe a desirable networked shared-data system.

• In ACID C means that a transaction preserves all the database rules, such as unique keys. (ACID consistency cannot be maintained across partitions.)

• The C in CAP refers only to single-copy consistency (request/response operation sequence), a strict subset of ACID consistency.

Page 17: Brewer’s Conjecture and the feasibility of CAP web services

ASYNCHRONOUS NETWORKS (1)

Theorem 1 It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:

• Availability

• Atomic consistency

in all fair executions (including those in which messages are lost).

Page 18: Brewer’s Conjecture and the feasibility of CAP web services

ASYNCHRONOUS NETWORKS (2)

Proof: We prove this by contradiction. Assume an algorithm A exists that meets the three criteria: atomicity, availability, and partition tolerance.

We construct an execution of A in which there exists a request that returns an inconsistent response.

Assume that the network consists of at least two nodes. Thus it can be divided into two disjoint, non-empty sets: {G1, G2}. The basic idea of the proof is to assume that all messages between G1 and G2 are lost.

If a write occurs in G1, and later a read occurs in G2, then the read operation cannot return the results of the earlier write operation.

Page 19: Brewer’s Conjecture and the feasibility of CAP web services

ASYNCHRONOUS NETWORKS (3)

The good scenario:
1. A writes a new value of V, which we'll call V1.
2. Then a message (M) is passed from N1 to N2, which updates the copy of V there.
3. Now any read by B of V will return V1.

If the network partitions (that is, messages from N1 to N2 are not delivered), then N2 still holds the old, inconsistent value of V when step (3) occurs.
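The two outcomes can be reproduced with a tiny simulation. This is a sketch under assumptions: the `Node` and `Network` classes, the `write_then_read` helper, and the placeholder initial value "V0" are all hypothetical names introduced for illustration.

```python
class Node:
    """A server holding its own copy of V."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class Network:
    """Delivers update messages unless the network is partitioned."""

    def __init__(self, partitioned=False):
        self.partitioned = partitioned

    def send_update(self, target, value):
        if self.partitioned:
            return False  # message M is lost
        target.value = value
        return True


def write_then_read(partitioned):
    n1 = Node("N1", "V0")
    n2 = Node("N2", "V0")
    net = Network(partitioned)
    n1.value = "V1"            # step 1: the write lands on N1
    net.send_update(n2, "V1")  # step 2: N1 sends M to N2
    return n2.value            # step 3: the read is served by N2
```

Calling `write_then_read(False)` models the good scenario, while `write_then_read(True)` models the partition: N2 still answers, which keeps it available, but it answers with the old value.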

Page 20: Brewer’s Conjecture and the feasibility of CAP web services

ASYNCHRONOUS NETWORKS (4)

Corollary 1.1 It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:

• Availability, in all fair executions.

• Atomic consistency, in fair executions in which no messages are lost.

Page 21: Brewer’s Conjecture and the feasibility of CAP web services

ASYNCHRONOUS NETWORKS (5)

Proof: The main idea is that in the asynchronous model an algorithm has no way of determining whether a message has been lost, or has been arbitrarily delayed in the transmission channel.

Therefore if there existed an algorithm that guaranteed atomic consistency in executions in which no messages were lost, then there would exist an algorithm that guaranteed atomic consistency in all executions. This would violate Theorem 1.

Page 22: Brewer’s Conjecture and the feasibility of CAP web services

FROM A TRANSACTIONAL PERSPECTIVE

• Say we have a transaction called α – in α1 A writes new values of V and in α2 B reads values of V.

• On a local system this would be easily handled by a database with some simple locking, isolating any attempt to read in α2 until α1 completes safely.

• In the distributed model though, with nodes N1 and N2 to worry about, the intermediate synchronizing message must also complete.

• Unless we can control when α2 happens, we can never guarantee it will see the same data values α1 writes.

• All methods to add control (blocking, isolation, centralized management, etc.) will impact either partition tolerance or the availability of α1 (A) and/or α2 (B).

Page 23: Brewer’s Conjecture and the feasibility of CAP web services

SOLUTIONS IN THE ASYNCHRONOUS MODEL (1)

“2 of 3”:
o CP (Atomic consistency, Partition Tolerant): Many distributed databases provide this type of guarantee, especially algorithms based on distributed locking or quorums. They give up unconditional availability but keep a conditional liveness guarantee: if there are no failures, every request gets a response; if certain failure patterns occur, the liveness condition is weakened and the service no longer returns responses.
o CA (Atomic consistency, Available): Systems that run on intranets and LANs are an example of these types of algorithms.
o AP (Available, Partition Tolerant): Web caches are one example of a weakly consistent network.
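The difference between the CP and AP choices can be illustrated with two read functions over the same replicated data. This is a hedged sketch, not any particular database's algorithm: `cp_read`, `ap_read`, the `Unavailable` exception, and the (version, value) layout are hypothetical, and the quorum read assumes that writes are also acknowledged by a majority of replicas.

```python
class Unavailable(Exception):
    """Raised by the CP-style read when no majority of replicas is reachable."""


def cp_read(replicas, reachable):
    """Quorum-style read: answer only if a strict majority is reachable.

    replicas  -- dict mapping replica name to a (version, value) pair
    reachable -- set of replica names this node can currently talk to
    """
    if len(reachable) * 2 <= len(replicas):
        raise Unavailable("no majority reachable; refusing to answer")
    # Any majority overlaps the majority that acknowledged the latest write.
    newest = max((replicas[name] for name in reachable), key=lambda pair: pair[0])
    return newest[1]


def ap_read(replicas, local_name):
    """Cache-style read: always answer from the local copy, even if stale."""
    return replicas[local_name][1]
```

With three replicas where only one is behind, the quorum read returns the newest value whenever two replicas are reachable and refuses to answer when isolated, while the local read always answers and may return the old value.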

Page 24: Brewer’s Conjecture and the feasibility of CAP web services

SOLUTIONS IN THE ASYNCHRONOUS MODEL (2)

Page 25: Brewer’s Conjecture and the feasibility of CAP web services

CIRCUMVENT THE IMPOSSIBILITY? (1)

Partially Synchronous Model
• In the real world, most networks are not purely asynchronous.
• If we allow each node in the network to have a clock, it is possible to build a more powerful service.
• In the partially synchronous model, every node has a clock and all clocks increase at the same rate.
• However, the clocks themselves are not synchronized, in that they might display different values at the same real time.
• In effect, the clocks act as timers: local state variables that the process can observe to measure how much time has passed.
• A local timer can be used to schedule an action to occur a certain interval of time after some other event.
• Furthermore, assume that every message is either delivered within a given, known time tmsg, or it is lost.
• Also, every node processes a received message within a given, known time tlocal, and local processing takes zero time.

Page 26: Brewer’s Conjecture and the feasibility of CAP web services

CIRCUMVENT THE IMPOSSIBILITY? (2)

Theorem 2 It is impossible in the partially synchronous network model to implement a read/write data object that guarantees the following properties:

• Availability

• Atomic consistency

in all executions (even those in which messages are lost)

Page 27: Brewer’s Conjecture and the feasibility of CAP web services

CIRCUMVENT THE IMPOSSIBILITY? (3)

Proof:

Same methodology as in case of Theorem 1 is used.

We divide the network into two components {G1, G2} and construct an admissible execution in which a write happens in one component, followed by a read operation in the other component.

This read operation can be shown to return inconsistent data.

Page 28: Brewer’s Conjecture and the feasibility of CAP web services

CIRCUMVENT THE IMPOSSIBILITY? (4)

(Reminder from asynchronous model) Corollary 1.1 It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:

• Availability, in all fair executions, • Atomic consistency, in fair executions in which no messages are lost.

In the partially synchronous model, the analogue of Corollary 1.1 does not hold, because the proof of that corollary depends on nodes being unaware of when a message is lost.

There are partially synchronous algorithms that will:
1. return atomic data when all messages in an execution are delivered (i.e. there are no partitions);
2. return inconsistent data only when messages are lost.

Page 29: Brewer’s Conjecture and the feasibility of CAP web services

CIRCUMVENT THE IMPOSSIBILITY? (5)

An example of such an algorithm is the centralized protocol with a single object state store, modified to time out lost messages:
• On a read (or write) request, a message is sent to the central node.
• If a response from the central node is received, then the node delivers the requested data (or an acknowledgement).
• If no response is received within 2 ∗ tmsg + tlocal, then the node concludes that the message was lost.
• The client is then sent a response: either the best known value of the local node (for a read operation) or an acknowledgement (for a write operation). In this case, atomic consistency may be violated.
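The protocol above can be sketched as a small simulation. Everything named here is an assumption made for illustration: the classes `CentralNode` and `EdgeNode`, the numeric bounds, and the `link_up` flag that stands in for a real timer of length 2 ∗ tmsg + tlocal (the timeout used for the centralized algorithm elsewhere in the deck).

```python
T_MSG = 0.05    # assumed bound on message delivery time (seconds)
T_LOCAL = 0.01  # assumed bound on processing time at a node (seconds)
TIMEOUT = 2 * T_MSG + T_LOCAL  # after this long, the message is taken as lost


class CentralNode:
    """Holds the single object state."""

    def __init__(self, initial):
        self.value = initial

    def handle(self, kind, payload=None):
        if kind == "write":
            self.value = payload
            return ("ack", None)
        return ("value", self.value)


class EdgeNode:
    """A node that forwards client requests to the central node."""

    def __init__(self, central, initial):
        self.central = central
        self.best_known = initial  # last value this node learned from the centre
        self.link_up = True        # False simulates a partition from the centre

    def _round_trip(self, kind, payload=None):
        # Simulation shortcut: a reply either arrives within TIMEOUT or never.
        if not self.link_up:
            return None  # the timer expired, so the message is treated as lost
        return self.central.handle(kind, payload)

    def read(self):
        reply = self._round_trip("read")
        if reply is not None:
            self.best_known = reply[1]
        # On a timeout this falls back to the best known local value.
        return self.best_known

    def write(self, value):
        # The client is acknowledged either way; if the message was lost,
        # the central node never sees the value.
        self._round_trip("write", value)
        return "ack"
```

When every link is up, each read goes through the central node and sees the latest write. When a node is cut off, its reads still terminate but may return a stale value, which is exactly the case where atomic consistency is violated.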

Page 30: Brewer’s Conjecture and the feasibility of CAP web services

CIRCUMVENT THE IMPOSSIBILITY? (6)

Page 31: Brewer’s Conjecture and the feasibility of CAP web services

CAP OF CONFUSION


Page 32: Brewer’s Conjecture and the feasibility of CAP web services

WHY “2 OF 3” IS MISLEADING? (1)

I. Partitions are rare, and there is little reason to forfeit C or A when the system is not partitioned.

II. The choice between C and A can occur many times within the same system at very fine granularity. Not only can subsystems make different choices, but the choice can change according to the operation or even the specific data or user involved.

III. All three properties are more continuous than binary:
1. Availability is obviously continuous from 0% to 100%.
2. There are many levels of consistency.
3. Partitions have nuances, including disagreement within the system about whether a partition exists.

Page 33: Brewer’s Conjecture and the feasibility of CAP web services

WHY “2 OF 3” IS MISLEADING? (2)

Page 34: Brewer’s Conjecture and the feasibility of CAP web services

CAP-LATENCY CONNECTION

• The essence of CAP takes place during a timeout, a period when the program must make a fundamental decision, the partition decision:
o cancel the operation and thus decrease availability, or
o proceed with the operation and thus risk inconsistency.
• In its classic interpretation, the CAP theorem ignores latency, although in practice latency and partitions are deeply related.
• A partition is a time bound on communication.
• Failing to achieve consistency within the time bound (due to high latency) implies a partition and thus a choice between C and A for this operation.
• In addition, some systems (for example Yahoo’s PNUTS) give up consistency not for the goal of improving availability, but for lower latency.

Page 35: Brewer’s Conjecture and the feasibility of CAP web services

MORE PROBLEMS WITH CAP?

• We saw that there is no real use in CP systems (systems that aren’t available?!), so the real meaning is that availability is only sacrificed when there is a network partition.
• In practice, this means that the roles of A and C in CAP are asymmetric: systems that sacrifice consistency (AP systems) tend to do so all the time, not just when there is a network partition.
• Is there any practical difference between CA and CP systems?
o As written above, a CP system sacrifices availability when there is a network partition.
o CA systems are not tolerant of network partitions, thus they won’t be available if there is a partition.
o So practically speaking, CA and CP are identical.
• The only real question is: what are you going to give up on partition, C or A? “2 out of 3” is just confusing.

Page 36: Brewer’s Conjecture and the feasibility of CAP web services

THE BIG (THEORETICAL) PICTURE


Page 37: Brewer’s Conjecture and the feasibility of CAP web services

WHAT’S THE REAL STORY HERE? (1)

• The tradeoff between consistency and availability in a partition-prone system is an example of the general tradeoff between safety and liveness in an unreliable system:
o Atomic consistency claims that in every execution, every response is correct with respect to the “prior” operations [i.e. a safety property].
o Availability claims that if an execution continues for long enough, then eventually we will get a response (something desirable happens) [i.e. a liveness property].

• Understanding the relationship between safety and liveness has been a long-standing challenge in distributed computing.

Page 38: Brewer’s Conjecture and the feasibility of CAP web services

WHAT’S THE REAL STORY HERE? (2)

• The replicated state machine paradigm is one of the most common approaches for building reliable distributed services.
• This paradigm achieves availability by replicating the service across a set of servers. The servers then agree [aka consensus] on every operation performed by the service.
• The impossibility of fault-tolerant consensus implies that services built according to the replicated state machine paradigm cannot achieve both availability and consistency in an asynchronous network (consensus impossibility was proved in 1985).

Page 39: Brewer’s Conjecture and the feasibility of CAP web services

CONCLUSION

• We have shown that it is impossible to reliably provide atomic consistent data when there are partitions in the network.
• It is feasible, however, to achieve any two of the three properties: consistency, availability and partition tolerance.
• In an asynchronous model, when no clocks are available, the impossibility result is fairly strong: it is impossible to provide consistent data, even allowing stale data to be returned when messages are lost.
• However, in partially synchronous models it is possible to achieve a practical compromise between consistency and availability.

Page 40: Brewer’s Conjecture and the feasibility of CAP web services

REFERENCES

1. Seth Gilbert and Nancy Lynch, “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services”, SIGACT News, June 2002.

2. Eric Brewer, “CAP Twelve Years Later: How the “Rules” Have Changed”, IEEE Computer, Volume 45, Issue 2, Feb. 2012.

3. Seth Gilbert and Nancy Lynch, “Perspectives on the CAP Theorem”, IEEE Computer, Volume 45, Issue 2, Feb. 2012.

Page 41: Brewer’s Conjecture and the feasibility of CAP web services
Page 42: Brewer’s Conjecture and the feasibility of CAP web services

APPENDIX A – PROOF OF THEOREM 1

Let v0 be the initial value of the atomic object.

Let α1 be the prefix of an execution of A in which a single write of a value not equal to v0 occurs in G1, ending with the termination of the write operation.

Assume that no other client requests occur in either G1 or G2. Further, assume that no messages from G1 are received in G2, and no messages from G2 are received in G1. We know that this write completes, by the availability requirement.

Similarly, let α2 be the prefix of an execution in which a single read occurs in G2, and no other client requests occur, ending with the termination of the read operation.

During α2 no messages from G2 are received in G1, and no messages from G1 are received in G2. Again we know that the read returns a value, by the availability requirement. The value returned by this execution must be v0, as no write operation has occurred in α2.

Let α be an execution beginning with α1 and continuing with α2. To the nodes in G2, α is indistinguishable from α2, as all the messages from G1 to G2 are lost (in both α1 and α2, which together make up α), and α1 does not include any client requests to nodes in G2.

Therefore in the α execution, the read request (from α2) must still return v0. However, the read request does not begin until after the write request (from α1) has completed. This therefore contradicts the atomicity property, proving that no such algorithm exists.

Page 43: Brewer’s Conjecture and the feasibility of CAP web services

APPENDIX B – PROOF OF COROLLARY 1.1

Assume for the sake of contradiction that there exists an algorithm A that always terminates, and guarantees atomic consistency in fair executions in which all messages are delivered.

Further, Theorem 1 implies that A does not guarantee atomic consistency in all fair executions, so there exists some fair execution α of A in which some response is not atomic.

At some finite point in execution α, the algorithm A returns a response that is not atomic. Let α′ be the prefix of α ending with the invalid response. Next, extend α′ to a fair execution α″, in which all messages are delivered. The execution α″ is now a fair execution in which all messages are delivered. However, this execution is not atomic. Therefore no such algorithm A exists.

Page 44: Brewer’s Conjecture and the feasibility of CAP web services

APPENDIX C – PROOF OF THEOREM 2

We construct execution α1: a single write request and acknowledgement occurs in G1, and all messages between the two components {G1, G2} are lost.

Let α2′ be an execution that begins with a long interval of time during which no client requests occur.

This interval must be at least as long as the entire duration of α1.

Then append to α2′ the events of α2 in the following manner: a single read request and response in G2, assuming all messages between the two components are lost.

Finally, we construct α by superimposing the two executions α1 and α2′.

The long interval of time in α2′ ensures that the write request completes before the read request begins. However, the read request returns the initial value, rather than the new value written by the write request, violating atomic consistency.

Page 45: Brewer’s Conjecture and the feasibility of CAP web services

BACKUP SLIDES

Page 46: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (1)

• While it is useful to guarantee that atomic data will be returned in executions in which all messages are delivered, it is equally important to specify what happens in executions in which some of the messages are lost.
• We discuss a possible weaker consistency condition that allows stale data to be returned when there are partitions, yet places formal requirements on the quality of the stale data returned.
• This consistency guarantee will require availability and atomic consistency in executions in which no messages are lost, and is therefore impossible to guarantee in the asynchronous model as a result of Corollary 1.1.
• In the partially synchronous model it often makes sense to base guarantees on how long an algorithm has had to rectify a situation.
• This consistency model ensures that if messages are delivered, then eventually some notion of atomicity is restored.

Page 47: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (2)

In an atomic execution, we define a partial order of the read and write operations and then require that if one operation begins after another one ends, the former does not precede the latter in the partial order.

We define a weaker guarantee, t-Connected Consistency, which defines a partial order in similar manner, but only requires that one operation not precede another if there is an interval between the operations in which all messages are delivered

Page 48: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (3)

A timed execution α of a read-write object is t-Connected Consistent if two criteria hold. First, in executions in which no messages are lost, the execution is atomic. Second, in executions in which messages are lost, there exists a partial order P on the operations in α such that:

1. P orders all write operations, and orders all read operations with respect to the write operations.

2. The value returned by every read operation is exactly the one written by the previous write operation in P, or the initial value if there is no such previous write in P.

3. The order in P is consistent with the order of read and write requests submitted at each node.

4. Assume that there exists an interval of time longer than t in which no messages are lost. Further, assume an operation θ completes before the interval begins, and another operation φ begins after the interval ends. Then φ does not precede θ in the partial order P.

Page 49: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (4)

t-Connected Consistency

This guarantee allows for some stale data when messages are lost, but provides a time limit on how long it takes for consistency to return, once the partition heals.

This definition can be generalized to provide consistency guarantees when only some of the nodes are connected and when connections are available only some of the time.

Page 50: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (5)

A variant of the “centralized algorithm” is t-Connected Consistent. Assume node C is the centralized node. The algorithm behaves as follows:
• Read at node A: A sends a request to C for the most recent value. If A receives a response from C within time 2 ∗ tmsg + tlocal, it saves the value and returns it to the client.
• Otherwise, A concludes that a message was lost and it returns the value with the highest sequence number that has ever been received from C, or the initial value if no value has yet been received from C. (When a client read request occurs at C it acts like any other node, sending messages to itself.)

Page 51: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (6)

• Write at A: A sends a message to C with the new value. A waits 2 ∗ tmsg + tlocal, or until it receives an acknowledgement from C, and then sends an acknowledgement to the client. At this point, either C has learned of the new value, or a message was lost, or both events occurred.
• If A concludes that a message was lost, it periodically retransmits the value to C (along with all values lost during earlier write operations) until it receives an acknowledgement from C. (As in the case of read operations, when a client write request occurs at C it acts like any other node, sending messages to itself.)
• New value is received at C: C serializes the write requests that it hears about by assigning them consecutive integer tags. Periodically C broadcasts the latest value and sequence number to all other nodes.
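The read, write, retransmit and broadcast rules of the modified centralized algorithm can be sketched as follows. This is an illustrative simulation under assumptions, not the authors' code: the class and method names are hypothetical, the `link_up` flag replaces real timers, and de-duplicating retransmitted writes by a per-node write id is an added assumption that the slides do not spell out.

```python
class CentralNode:
    """Node C: serializes writes by assigning consecutive integer tags."""

    def __init__(self, initial):
        self.seq = 0
        self.value = initial
        self.seen = set()  # write ids already tagged (assumed de-duplication)

    def receive_write(self, write_id, value):
        if write_id not in self.seen:
            self.seen.add(write_id)
            self.seq += 1
            self.value = value
        return "ack"

    def latest(self):
        return (self.seq, self.value)

    def broadcast(self, nodes):
        # Periodic broadcast of the latest value and its sequence number.
        for node in nodes:
            if node.link_up:
                node.learn(self.latest())


class Replica:
    """Any other node, such as A."""

    def __init__(self, name, central, initial):
        self.name = name
        self.central = central
        self.best = (0, initial)  # highest-tagged value received from C
        self.pending = []         # writes not yet acknowledged by C
        self.next_id = 0
        self.link_up = True       # False simulates lost messages

    def learn(self, tagged):
        if tagged[0] > self.best[0]:
            self.best = tagged

    def read(self):
        if self.link_up:          # response arrived within the timeout
            self.learn(self.central.latest())
        return self.best[1]       # otherwise: highest sequence number seen

    def write(self, value):
        self.pending.append(((self.name, self.next_id), value))
        self.next_id += 1
        self.retransmit()
        return "ack"              # the client is acknowledged either way

    def retransmit(self):
        # Called on write and again periodically until C has acknowledged.
        while self.link_up and self.pending:
            write_id, value = self.pending[0]
            self.central.receive_write(write_id, value)
            self.pending.pop(0)
```

While A is cut off, its write is acknowledged to the client but other nodes keep reading the old value; once the link returns and A retransmits, C tags the write and later reads and broadcasts carry the new value.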

Page 52: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (7)

Page 53: Brewer’s Conjecture and the feasibility of CAP web services

WEAKER CONSISTENCY CONDITIONS (8)

Theorem 4 The modified centralized algorithm is t-Connected consistent