The Owner Share scheduler for a distributed system, 2009 International Conference on Parallel Processing Workshops. Reporter: 1098308111 李長霖

Post on 03-Jan-2016

The Owner Share scheduler for a distributed system

2009 International Conference on Parallel Processing Workshops

Reporter: 1098308111 李長霖

Outline

Introduction
Related Work
The Owner-Share Enforcement Policy
Performance Metrics
Tests and Results
Conclusion

INTRODUCTION

Resource sharing is a good way to make use of the processing capacity and data storage of a computational system.

An important requirement in this situation is that resource owners must be able to access their share of the resources whenever they want to.

Owner Share Enforcement Policy (OSEP)

OSEP is implemented in Condor, which supports job preemption.

OSEP dynamically adjusts the amount of allocated resources for users according to the amount of resources they provide.

Related Work

Fair-Share policies try to distribute resources proportionally among users based on historical usage.

By definition, a Fair-Share policy does not preempt jobs that are already running; users whose jobs are postponed are compensated with heavier use at a later moment.

THE OWNER-SHARE ENFORCEMENT POLICY

To implement OSEP, the scheduling algorithm is divided into two main parts:

The Dynamic Algorithm is responsible for the policy's dynamic behavior.

The Decision Algorithm is responsible for analyzing the user information about allocated resources and making preemption decisions according to OSEP.
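
The slides do not show the Decision Algorithm itself, so the following is only a minimal sketch of the kind of decision it has to make, assuming each user's state can be summarized as an owner share, a running-job count, and an idle-job count. The names `UserState` and `preemption_decisions` and the numbers are hypothetical, not taken from the paper or from Condor.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    share: int    # nodes the user is entitled to (owner share)
    running: int  # jobs currently running
    idle: int     # jobs waiting in the queue


def preemption_decisions(users, free_nodes=0):
    """Return {user: number of running jobs to preempt} for users above their share."""
    # Nodes that owners below their share could use right now.
    needed = sum(min(u.idle, max(u.share - u.running, 0)) for u in users.values())
    # Free nodes are handed out first; only the remainder requires preemption.
    needed = max(needed - free_nodes, 0)

    decisions = {}
    # Reclaim nodes from the users with the largest excess first.
    for name, u in sorted(users.items(), key=lambda kv: kv[1].share - kv[1].running):
        take = min(max(u.running - u.share, 0), needed)
        if take:
            decisions[name] = take
            needed -= take
    return decisions


# Hypothetical numbers: 8 nodes, two owners with 4 nodes each.
users = {
    "a": UserState(share=4, running=7, idle=0),
    "b": UserState(share=4, running=1, idle=2),
}
decisions = preemption_decisions(users)
```

Here user b is entitled to three more nodes but has only two idle jobs, so two of user a's jobs are selected for preemption; if free nodes were available, they would be used before any job is preempted.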

Dynamic Algorithm

The Owner-Share enforcement system needs updated data about resource usage in order to make the appropriate decisions.

Condor's ClassAds provide the data needed by OSEP.

With the ClassAd mechanism, jobs can state their requirements and preferences and, similarly, machines can specify requirements and preferences about the jobs they are willing to run.
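
The two-sided matching idea can be illustrated with a small model in plain Python. This is not the ClassAd language or any Condor API; the dictionary keys, `matches`, and `best_machine` are hypothetical names chosen for the sketch, and only the job-side preference (rank) is modeled.

```python
def matches(job, machine):
    """A job and a machine match only if each side's requirements accept the other."""
    return job["requirements"](machine) and machine["requirements"](job)


def best_machine(job, machines):
    """Among matching machines, pick the one the job ranks highest (None if no match)."""
    candidates = [m for m in machines if matches(job, m)]
    return max(candidates, key=job["rank"], default=None)


job = {
    "owner": "u1",
    "requirements": lambda m: m["memory_mb"] >= 512,  # job needs at least 512 MB
    "rank": lambda m: m["memory_mb"],                 # preference: more memory is better
}
machines = [
    {"name": "node1", "memory_mb": 1024, "requirements": lambda j: True},
    # node2 only accepts jobs from its owner, u2.
    {"name": "node2", "memory_mb": 2048, "requirements": lambda j: j["owner"] == "u2"},
    {"name": "node3", "memory_mb": 256, "requirements": lambda j: True},
]
chosen = best_machine(job, machines)
```

In this example node2 has the most memory but rejects the job, and node3 is too small, so the job is matched to node1.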

Decision Algorithm

PERFORMANCE METRICS

Ideal case for OSEP

1. 10 available nodes.

2. Two users, u1 and u2, with the same owner share of resources, 5.

3. User u1 has 10 active jobs and no idle jobs at time t0; at time t1, user u2 submits 10 jobs.
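
The ideal allocation for this scenario can be worked out with a few lines of code. The sketch below assumes that a user first receives up to the owner share and that unused nodes are then lent to users who still have demand; `ideal_allocation` is a hypothetical helper, not a function from the paper.

```python
def ideal_allocation(shares, demand):
    """Give each user min(demand, share), then lend unused nodes to users who want more."""
    alloc = {u: min(demand[u], shares[u]) for u in shares}
    spare = sum(shares.values()) - sum(alloc.values())
    for u in shares:  # lend idle capacity, in listing order
        extra = min(demand[u] - alloc[u], spare)
        alloc[u] += extra
        spare -= extra
    return alloc


shares = {"u1": 5, "u2": 5}                              # 10 nodes in total
at_t0 = ideal_allocation(shares, {"u1": 10, "u2": 0})    # only u1 has jobs
at_t1 = ideal_allocation(shares, {"u1": 10, "u2": 10})   # u2 submits 10 jobs
```

At t0, u1 borrows u2's idle nodes and runs all 10 jobs. At t1, each user is cut back to 5 nodes; in the ideal case this switch happens with no delay, which means 5 of u1's jobs give up their nodes at once.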

A. Policy Violation

Policy Violation (PV) measures the latency introduced by the real system when compared to the ideal situation.

B. Loss of Capacity

Loss of Capacity (LC) defines the amount of the system utilization that was lost due to the actions of the enforcement algorithm.

C. Policy cost

The policy cost (PC) measures the overhead caused by OSEP in the system or the waste of CPU cycles by job reprocessing.

It is an important metric for evaluating the impact of checkpointing when OSEP is applied.
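
The slides give no formula for PC, so the following is only an illustration of why checkpointing matters for reprocessing cost. It assumes a preempted job restarts from its last checkpoint (or from the beginning if there is none) and ignores the cost of taking the checkpoints. The function `lost_work` and the 120-second interval are hypothetical.

```python
from typing import Optional


def lost_work(elapsed_s: int, checkpoint_interval_s: Optional[int] = None) -> int:
    """CPU seconds that must be redone when a job is preempted after elapsed_s seconds."""
    if checkpoint_interval_s is None:
        return elapsed_s                      # no checkpoint: restart from scratch
    return elapsed_s % checkpoint_interval_s  # resume from the last checkpoint


no_ckpt = lost_work(300)         # job preempted 300 s after it started
with_ckpt = lost_work(300, 120)  # same job, checkpointed every 120 s
```

Without checkpointing all 300 seconds are reprocessed; with a checkpoint every 120 seconds only the 60 seconds since the last checkpoint are lost.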

D. User satisfaction

Depending on the system workload, it can be better for the user to wait and get more resources later than to get the whole share of resources immediately.

TESTS AND RESULTS

A. Evaluating the impact of environment variations

User 1 and user 2 have the same share of resources (50% each).

User 1 submits jobs that execute for 3600 seconds each at t1 = 0 s.

User 2 submits jobs that execute for only 600 seconds each at t2 = 300 s.

B. Evaluating the impact of several users and different shares

System with 20 resources;

Different shares: user 1 with 30%, user 2 with 40%, and users 3 and 4 with 15% each;

15 jobs per user, with job sizes ranging from 600 to 3600 seconds each;

t1 = 0 s, t2 = 100 s, t3 = 200 s and t4 = 210 s, where tx is the moment when user x submits his/her jobs.
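
As a quick check of this setup, the percentage shares can be converted into node counts. The variable names below are illustrative only.

```python
resources = 20
share_pct = {"user1": 30, "user2": 40, "user3": 15, "user4": 15}
jobs_per_user = 15

# Owner share of each user, in nodes.
nodes = {u: resources * p // 100 for u, p in share_pct.items()}

# Jobs each user has beyond the owner share; these can only run on borrowed nodes.
beyond_share = {u: jobs_per_user - n for u, n in nodes.items()}
```

The shares come to 6, 8, 3, and 3 nodes, which add up to the 20 resources. Every user submits more jobs than the owner share covers, so once all four users are active the system is fully contended.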

CONCLUSION

This paper presented a scheduling infrastructure for the Owner Share scheduler based on the distributed ownership concept.

The results showed that OSEP provides user satisfaction proportional to the user's share of resources.