
SecureMR: A Service Integrity Assurance Framework for MapReduce

Author: Wei Wei, Juan Du, Ting Yu, Xiaohui Gu

Source: Annual Computer Security Applications Conference, 2009, pp.73-82.

Presenter: Tsuei-Hung Sun (孫翠鴻 )

Date: 2010/9/17


Outline

• Introduction

• Motivation

• Contribution

• Scheme

• Security analysis

• Performance evaluation

• Comment


Introduction

• MapReduce

– A parallel data processing model that simplifies parallel data processing on large clusters.

– Proposed by Google.

– Mainly runs on clusters belonging to a single administrative domain.

– Yahoo’s Hadoop

– Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (Amazon S3).


Introduction

Fig. The MapReduce data processing reference model. (Labels in the figure: M1, M2, M3, R1, R2, R3, Distributed File System.)
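As a rough illustration of the model in the figure, the sketch below runs a word count as a map step, a partition step, and a reduce step in a single process. The function names (`map_fn`, `partition`, `reduce_fn`, `run_job`) and the byte-sum partitioner are assumptions made for this example; they are not taken from the paper or from any MapReduce implementation.

```python
from collections import defaultdict


def map_fn(block):
    # Map: emit a (word, 1) pair for every word in one input block.
    return [(word, 1) for word in block.split()]


def partition(pairs, r):
    # Split one mapper's intermediate pairs into r partitions, one per reducer.
    # A byte-sum is used instead of hash() so the result is deterministic.
    parts = [[] for _ in range(r)]
    for key, value in pairs:
        parts[sum(key.encode()) % r].append((key, value))
    return parts


def reduce_fn(pairs):
    # Reduce: sum the counts for each key in one partition.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(blocks, r=2):
    # One mapper per input block, r reducers; everything runs locally here.
    mapped = [partition(map_fn(block), r) for block in blocks]
    results = {}
    for i in range(r):
        merged = [pair for parts in mapped for pair in parts[i]]
        results.update(reduce_fn(merged))
    return results
```

In a real deployment the blocks would be read from the distributed file system and the mappers and reducers would run on different workers, which is where the integrity question discussed later arises.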


Introduction

Fig. Combining multiple map and reduce phases.


Introduction

• Data processing service integrity

– Replication-based techniques

– Sampling techniques

– Checkpoint-based verification


Motivation

• Existing work addresses service integrity, but not for data processing services.

• Drawbacks of replication-based techniques

– Replicating all distributed computing tasks for consistency verification is inefficient.

– Centralized consistency verification over massive result data does not scale.


Contribution

• Decentralized replication-based integrity verification for MapReduce in open systems.

• Achieves security: non-repudiation, resilience to DoS attacks and replay attacks.

• Security components can be easily integrated into existing MapReduce implementations.

• Low performance overhead.

• The first attempt to address the integrity of data processing services.


Scheme

• SecureMR - Architecture Design


Scheme

• SecureMR - Communication Design

– Commitment protocol

– Verification protocol


Scheme

• Commitment Protocol

– IDMap: a monotonically increasing identity of a map task.

– DataLoc: input data block location.

– sig: Master’s signature.

– KpubM: Mapper’s public key.

– sigM: Mapper’s signature.

– HP1,…,HPr: hash value for each partition of the mapper’s intermediate result.

(Labels in the protocol diagram: Scheduler, Task Executor, Commit, Manager.)
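The commitment step can be sketched as follows: the mapper hashes each of its r intermediate partitions (HP1,…,HPr) and signs the result together with IDMap. This is a minimal sketch under stated assumptions, not the paper's message format: the field layout, SHA-256, and the helper names are hypothetical, and HMAC with a shared key stands in for the mapper's public-key signature sigM (a real deployment would need an asymmetric signature to get non-repudiation).

```python
import hashlib
import hmac
import json


def hash_partitions(partitions):
    # HP1..HPr: one hash per partition of the mapper's intermediate result.
    return [hashlib.sha256(p).hexdigest() for p in partitions]


def sign(key, payload):
    # Stand-in for a public-key signature (assumption): HMAC-SHA256
    # over a canonical JSON encoding of the payload.
    message = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_commitment(mapper_key, id_map, partitions):
    # Mapper side: commit to the partition hashes before serving any data.
    body = {"IDMap": id_map, "HP": hash_partitions(partitions)}
    return {"body": body, "sigM": sign(mapper_key, body)}


def check_commitment(mapper_key, commitment):
    # Receiver side: recompute the signature over the body and compare.
    expected = sign(mapper_key, commitment["body"])
    return hmac.compare_digest(expected, commitment["sigM"])
```

Committing to hashes rather than to the partitions themselves keeps the message to the master small, which matches the slides' concern about centralized verification over massive result data.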


Scheme

• Verification Protocol

– Pi: partition of intermediate results that the reducer will process.

– ADM: Mapper’s address.

– HPi: hash of partition Pi committed by the Committer.

– ReqSeq: request sequence number.

(Labels in the protocol diagram: Task Executor, Manager, Scheduler, Verifier, Committer, sigR.)
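On the reducer side, the check against the committed hash HPi might look like the sketch below. It assumes SHA-256 as the hash and assumes that replicas of the same map task are compared through their committed hashes; the function names are hypothetical and the signature and ReqSeq handling are left out.

```python
import hashlib
import hmac


def committed_hash(partition):
    # Hash of a partition, as the mapper would have committed it.
    return hashlib.sha256(partition).hexdigest()


def verify_partition(received, hp_i):
    # The reducer recomputes the hash of the data it received for Pi
    # and compares it with the committed value HPi.
    return hmac.compare_digest(committed_hash(received), hp_i)


def consistent(hp_from_replicas):
    # Replication-based check: replicas of the same map task should
    # have committed identical hashes for the same partition.
    return len(set(hp_from_replicas)) == 1
```

A mismatch in `verify_partition` means the mapper served data that differs from what it committed to; a mismatch in `consistent` means at least one replica produced a different result.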


Scheme

• Extension for Reducers and MapReduce Chain

(Labels in the figure: MapPhase, ReducePhase, VerifyPhase. Annotations: Add Verifier component; Add Committer component.)


Security analysis

• Collusive Attack - Attacker behavior analysis

– Periodical Attacker

• Naive attacker

• Attacker without collusion

• Attacker with collusion

– Strategic Attacker


Security analysis

Fig. Detection Rate for Non-Collusion Naive Attacker (b = 20; Pm = 1).

Fig. Detection Rate for Non-Collusion Periodical Attacker (b = 20; Pm = 0.5).

b: number of blocks in one input job. Pm: misbehaving probability. l: number of jobs a mapper has processed when its misbehavior is detected.
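To give a feel for how the misbehaving probability and the number of processed tasks interact, here is a deliberately simplified toy model. It is not the paper's analysis: it assumes a non-colluding attacker, that each task is independently duplicated with probability `p_d` (a hypothetical parameter standing for the duplication rate), and that a misbehaving result is always caught when its task is duplicated.

```python
def detection_rate(p_m, p_d, l):
    # Toy model (assumption): a task exposes the attacker only if the
    # attacker misbehaves on it (p_m) and it is also duplicated (p_d).
    per_task = p_m * p_d
    # Probability of at least one exposure over l independent tasks.
    return 1.0 - (1.0 - per_task) ** l
```

Under these assumptions, a naive attacker (Pm = 1) is exposed faster than a periodical one (Pm = 0.5) for the same duplication rate, since `per_task` is larger.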


Security analysis

Fig. Detection Rate for Collusion Periodical Attacker (n = 50; Pm = 0.5; b = 20; l = 15).

Fig. Misbehaving Probability vs. Duplication Rate (n = 50; b = 20; l = 15).

n: total number of workers. m: number of malicious workers.


Performance evaluation

T: time. D: data transmission cost. r: number of reducers.


Performance evaluation

Fig. Response Time vs. Number of Reduce Tasks (number of map tasks = 60; data size = 1 GB).

Fig. Response Time vs. Data Size (number of map tasks = 60; number of reduce tasks = 25).


Performance evaluation

Fig. Response Time vs. Duplication Rate.

Fig. Response Time vs. Number of Reduce Tasks.

Number of map tasks = 60; data size = 1 GB.


Comment

• Assign and Notify can be combined into one step.

• TicketM contains some parameters that duplicate those in the part of the request message signed by the reducer.

• If the first request fails, what can the reducer do? (How are TicketM and ReqSeq renewed?)

• In the Response message, the mapper could sign Data together with the other fields, which would avoid one hash, and the reducer would not need to check it either.