Load Balance in Linux 2.6.32
Load balancing
Sung-joon Choi
Real-Time Operating Systems Lab.
Seoul National University
2011-09-15
Contents
Load balancing
  Purpose
  Definition
  General cases
    • Active load balancing
    • Passive load balancing
  Special cases
    • Execution of a new task
    • CPU shutdown or intentional IDLE
  Limitation
Load Balancing
Purpose
  As long as the system has more tasks than cores, keep every core running without falling into the IDLE state
Mechanism
  Keep the difference in workload between cores from becoming large
Definition
  Load balancing
  • Adjusting an SMP system so that each core carries an equal amount of work (load)
  Load
  • The sum of the weights of all tasks in a core's run-queue
Load Balancing
Definition (cont.)
  Idlest run-queue
  • The run-queue with the minimum load among all cores
  Busiest run-queue
  • The run-queue with the maximum scaled load, “load / (core’s power)”
  • If every core has the same power, this is the run-queue of the core with the maximum load
  • In a system that uses heterogeneous processors, each core's power may differ
  • Power generally means capacity, that is, the ability to get work done
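The definitions above can be sketched in a few lines of C. This is a simplified userspace model, not kernel code: the types `struct task` and `struct run_queue` and the functions `rq_load()`, `find_idlest()`, and `find_busiest()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a per-core run-queue (not the kernel's struct rq). */
struct task {
    unsigned long weight;
};

struct run_queue {
    const struct task *tasks;   /* tasks queued on this core */
    size_t nr_tasks;
    unsigned long power;        /* relative capacity of the core, must be > 0 */
};

/* Load = sum of the weights of all tasks on the run-queue. */
unsigned long rq_load(const struct run_queue *rq)
{
    unsigned long load = 0;
    for (size_t i = 0; i < rq->nr_tasks; i++)
        load += rq->tasks[i].weight;
    return load;
}

/* Idlest run-queue: the one with the minimum load. Returns the core index. */
size_t find_idlest(const struct run_queue *rqs, size_t n)
{
    assert(n > 0);
    size_t idlest = 0;
    for (size_t i = 1; i < n; i++)
        if (rq_load(&rqs[i]) < rq_load(&rqs[idlest]))
            idlest = i;
    return idlest;
}

/* Busiest run-queue: the one with the maximum load / power.
 * a/pa > b/pb is compared as a*pb > b*pa to stay in integer arithmetic. */
size_t find_busiest(const struct run_queue *rqs, size_t n)
{
    assert(n > 0);
    size_t busiest = 0;
    for (size_t i = 1; i < n; i++) {
        unsigned long cand = rq_load(&rqs[i]) * rqs[busiest].power;
        unsigned long best = rq_load(&rqs[busiest]) * rqs[i].power;
        if (cand > best)
            busiest = i;
    }
    return busiest;
}
```

With equal power on every core, `find_busiest()` reduces to picking the maximum load. Doubling the power of a core halves its scaled load, so a heavily loaded but powerful core may no longer be the busiest.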
Contents
Load balancing
  Purpose
  Definition
  General cases (main focus)
    • Active load balancing
    • Passive load balancing
  Special cases
    • Execution of a new task
    • CPU shutdown or intentional IDLE
  Limitation
General Cases
Active Load Balancing
[Figure: four snapshots of the run-queues of Core 0 and Core 1, with task states READY, RUNNING, and Going to DEAD. Core 0 holds Tasks 1 to 3 and Core 1 holds Tasks 4 and 5. As the tasks on Core 1 finish, its run-queue becomes empty and Core 1 is about to go IDLE, so a task (Task 2) is migrated from Core 0 to Core 1.]
(Assumption: all tasks have the same weight)
Active Load Balancing
Implementation
  When a task finishes its execution
  do_exit()
  • Sets the task's state to “TASK_DEAD”
  • Calls schedule()
    – In its back-end procedure, if the core's state is IDLE, it calls “idle_balance()”
    – idle_balance()
      » Calls “load_balance()” to pull a task from the busiest core's run-queue
      » load_balance()
      » Performs the task migration
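The call chain above can be modeled in userspace as follows. This is only a sketch under the slides' assumption that all tasks have the same weight, so load is proportional to the number of tasks; `struct run_queue`, `migrate_one()`, and `idle_balance_sketch()` are hypothetical names and do not mirror the kernel's actual signatures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8

/* Hypothetical userspace model; the kernel's real structures are far richer. */
struct run_queue {
    int tasks[MAX_TASKS];   /* task ids, index 0 is the current (running) task */
    size_t nr_running;
};

/* Move one READY task (never the running one at index 0) from src to dst.
 * Returns 1 if a task was migrated, 0 otherwise. */
int migrate_one(struct run_queue *src, struct run_queue *dst)
{
    if (src->nr_running < 2 || dst->nr_running >= MAX_TASKS)
        return 0;
    dst->tasks[dst->nr_running++] = src->tasks[--src->nr_running];
    return 1;
}

/* Called when this_cpu has nothing left to run: pull from the busiest queue. */
int idle_balance_sketch(struct run_queue *rqs, size_t n, size_t this_cpu)
{
    assert(this_cpu < n);
    if (rqs[this_cpu].nr_running != 0)
        return 0;                       /* not idle, nothing to do */

    size_t busiest = this_cpu;
    for (size_t i = 0; i < n; i++)
        if (rqs[i].nr_running > rqs[busiest].nr_running)
            busiest = i;

    if (busiest == this_cpu)
        return 0;                       /* every queue is empty */
    return migrate_one(&rqs[busiest], &rqs[this_cpu]);
}
```

The model pulls the task at the tail of the busiest queue; which task the kernel actually picks is not stated in the slides.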
Active Load Balancing
Drawback
  Active load balancing can be sufficient to achieve load balancing. However, if the workload gap between cores is large but each task runs for a long time, no core may become IDLE for a while, and the goal of load balancing cannot be met in the short term.
  Periodic adjustment is needed to avoid this situation
General Cases
Passive (Periodic) Load Balancing
[Figure: four snapshots of the run-queues of Core 0 and Core 1, with task states READY, RUNNING, and Going to DEAD. For a long time there is no IDLE core. A periodic check identifies the busiest and the idlest run-queue; if there is a big gap in load between the cores, a task is migrated from the busiest run-queue to the idlest one.]
(Assumption: all tasks have the same weight)
Passive Load Balancing
Triggered by scheduler_tick()
  The tick value is compared with a parameter “next_balance”, which is the time at which load balancing is due
  • Each run-queue has its own “next_balance”
  • If a core performs active load balancing, the parameter is set to 1 second later
  • If a core performs passive load balancing, the parameter is set to 1 minute later
  • The reason for 1 second versus 1 minute: a core that balanced because it went IDLE is likely to become IDLE again, so it should be balanced again soon
Executed by bottom-half handler
  A softirq named “SCHED_SOFTIRQ” is handled by “run_rebalance_domains()”
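A minimal sketch of the per-tick check described above. The tick rate `HZ` of 100 and the names `tick_after_eq()`, `next_balance_after()`, and `balance_due()` are assumptions made for this example; only the 1 second and 1 minute intervals come from the slide.

```c
#include <assert.h>

#define HZ 100UL  /* assumed tick rate for this sketch; the real value is configurable */

/* Wrap-safe "a is at or after b" comparison, in the style of the kernel's
 * time_after_eq(): the difference is interpreted as a signed value. */
int tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

enum balance_kind { BALANCE_ACTIVE, BALANCE_PASSIVE };

/* Next balancing deadline: 1 second after an active (idle) balance,
 * 1 minute after a passive (periodic) one. */
unsigned long next_balance_after(unsigned long now, enum balance_kind kind)
{
    assert(kind == BALANCE_ACTIVE || kind == BALANCE_PASSIVE);
    return now + (kind == BALANCE_ACTIVE ? HZ : 60 * HZ);
}

/* Per-tick check: returns 1 when load balancing is due,
 * which is the point where the softirq would be raised. */
int balance_due(unsigned long tick, unsigned long next_balance)
{
    return tick_after_eq(tick, next_balance);
}
```

Comparing the signed difference rather than the raw values keeps the check correct when the tick counter wraps around.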
Passive Load Balancing
Implementation – start load balance
  Timer interrupt invokes “scheduler_tick()”
  If the tick value is equal to or greater than the parameter “next_balance”, load balancing starts
[Figure: run-queues of Core 0 (busiest) and Core 1 (idlest), each with its own next_balance value; task states READY and RUNNING.]
(Assumption: all tasks have the same weight)
Passive Load Balancing
Implementation – step 1
  If the tick value is equal to or greater than the parameter “next_balance”,
  Step 1: raises a softirq “SCHED_SOFTIRQ” to the kernel
[Figure: run-queues of Core 0 (busiest) and Core 1 (idlest) with their next_balance values, and the softirq table containing the SCHED_SOFTIRQ entry; task states READY and RUNNING.]
(Assumption: all tasks have the same weight)
Passive Load Balancing
Implementation – step 2
  If the tick value is equal to or greater than the parameter “next_balance”,
  Step 1: raises a softirq “SCHED_SOFTIRQ” to the kernel
  Step 2: finds the idlest run-queue and invokes the kernel thread “ksoftirqd” on it
[Figure: run-queues of Core 0 (busiest) and Core 1 (idlest) with their next_balance values; ksoftirqd is placed on the idlest run-queue; the softirq table contains the SCHED_SOFTIRQ entry; task states READY and RUNNING.]
(Assumption: all tasks have the same weight)
Passive Load Balancing
Implementation – step 3
  If the tick value is equal to or greater than the parameter “next_balance”,
  Step 1: raises a softirq “SCHED_SOFTIRQ” to the kernel
  Step 2: finds the idlest run-queue and invokes the kernel thread “ksoftirqd” on it
  Step 3: the thread executes a function “do_ksoftirqd()” that picks a softirq and calls its handler function, run_rebalance_domains() (the bottom-half handler)
[Figure: run-queues of Core 0 (busiest) and Core 1 (idlest) with their next_balance values; ksoftirqd on Core 1 takes SCHED_SOFTIRQ from the softirq table and calls run_rebalance_domains(); task states READY and RUNNING.]
(Assumption: all tasks have the same weight)
Passive Load Balancing
Implementation – step 4
  If the tick value is equal to or greater than the parameter “next_balance”,
  Step 1: raises a softirq “SCHED_SOFTIRQ” to the kernel
  Step 2: finds the idlest run-queue and invokes the kernel thread “ksoftirqd” on it
  Step 3: the thread executes a function “do_ksoftirqd()” that picks a softirq and calls its handler function, run_rebalance_domains() (the bottom-half handler)
  Step 4: the handler function finds the busiest run-queue to pull a task from
[Figure: run-queues of Core 0 (busiest) and Core 1 (idlest) with their next_balance values; the handler run_rebalance_domains(), called by ksoftirqd on Core 1, selects the busiest run-queue; task states READY and RUNNING.]
(Assumption: all tasks have the same weight)
Passive Load Balancing
Implementation – step 5
  If the tick value is equal to or greater than the parameter “next_balance”,
  Step 1: raises a softirq “SCHED_SOFTIRQ” to the kernel
  Step 2: finds the idlest run-queue and invokes the kernel thread “ksoftirqd” on it
  Step 3: the thread executes a function “do_ksoftirqd()” that picks a softirq and calls its handler function, run_rebalance_domains() (the bottom-half handler)
  Step 4: the handler function finds the busiest run-queue to pull a task from
  Step 5: task migration
[Figure: run-queues of Core 0 (busiest) and Core 1 (idlest) with their next_balance values; the handler run_rebalance_domains() migrates Task 4 from the busiest run-queue to the idlest one; task states READY and RUNNING.]
(Assumption: all tasks have the same weight)
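Steps 1 and 3 to 5 can be illustrated with a miniature softirq table. This is a sketch, not the kernel's implementation: every name here (`raise_softirq_sketch()`, `run_pending_softirqs()`, `rebalance_handler()`, the `load` array) is hypothetical, the starting loads of 4 and 2 tasks are assumed values, and step 2 (choosing the core on which the thread runs) is reduced to the fixed variable `this_cpu`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a softirq table: one pending bit and one handler per entry. */
enum { SCHED_SOFTIRQ_NR = 0, NR_SOFTIRQS_SKETCH = 1 };

typedef void (*softirq_handler)(void);

unsigned int pending;                         /* bitmask of raised softirqs */
softirq_handler handlers[NR_SOFTIRQS_SKETCH]; /* handler table, filled in by the caller */

unsigned long load[2] = { 4, 2 };             /* tasks per core, equal weights assumed */
size_t this_cpu = 1;                          /* core on which the handler runs */

/* Step 1: mark the softirq as pending. */
void raise_softirq_sketch(unsigned int nr)
{
    pending |= 1u << nr;
}

/* Steps 4 and 5: find the busiest run-queue and pull one task from it,
 * but only if the move actually narrows the gap. */
void rebalance_handler(void)
{
    size_t busiest = load[0] >= load[1] ? 0 : 1;
    if (busiest != this_cpu && load[busiest] > load[this_cpu] + 1) {
        load[busiest]--;
        load[this_cpu]++;
    }
}

/* Step 3: pick each pending softirq, clear its bit, and call its handler. */
void run_pending_softirqs(void)
{
    for (unsigned int nr = 0; nr < NR_SOFTIRQS_SKETCH; nr++) {
        if (!(pending & (1u << nr)))
            continue;
        pending &= ~(1u << nr);
        assert(handlers[nr] != NULL);
        handlers[nr]();
    }
}
```

The point of the split is that the tick path only sets a bit; the comparatively expensive search and migration run later in the bottom-half handler.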
Passive Load Balancing
Implementation
[Figure: run-queues of Core 0 and Core 1, each with its next_balance value, shown before and after the task migration.]
Passive Load Balancing
Drawback
  This algorithm has a large overhead
  • It must find the maximum and minimum load across all cores
  • And, if the current core is not the idlest one,
    – The kernel thread “ksoftirqd” must be enqueued on the idlest run-queue of another core and woken up
    – Also, the current task of the target core, which has the idlest run-queue, is preempted by “ksoftirqd”
  Tradeoff: balancing time interval vs. throughput and latency
Special Cases
Execution of a new task
  When a new task is created on a core, the kernel checks whether that core's load leaves room for the new task
  • If the load is unacceptable, the core's current task is migrated to the idlest core's run-queue and rescheduled
  • The new task is then executed on the original core (not the idlest core)
CPU shutdown or intentional IDLE
  When a core must be shut down or intentionally made IDLE, for example under POWER_SAVING_LOAD_BALANCE,
  all tasks in its run-queue are migrated to other cores
  In effect, this case is just a task migration
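The shutdown case can be sketched as a loop that empties one run-queue into the others. The function `evacuate_core()` is a hypothetical name, only task counts are tracked, and sending each task to the currently idlest remaining core is an assumption of this sketch; the slide only says that the tasks go to other cores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: nr_running[i] is the number of tasks on core i,
 * all tasks assumed to have equal weight. Moves every task off `dying`. */
void evacuate_core(unsigned long *nr_running, size_t n, size_t dying)
{
    assert(n >= 2 && dying < n);
    while (nr_running[dying] > 0) {
        /* Pick the idlest core among the remaining ones. */
        size_t target = (dying == 0) ? 1 : 0;
        for (size_t i = 0; i < n; i++)
            if (i != dying && nr_running[i] < nr_running[target])
                target = i;
        nr_running[dying]--;
        nr_running[target]++;
    }
}
```

Re-evaluating the target for every task keeps the remaining cores balanced while the run-queue drains.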
Limitation
Global Fairness
  Global fairness is the degree to which, on an SMP system with multiple CPUs, every task is guaranteed run-time in proportion to its weight.
  In an SMP environment each CPU has its own run-queue, and load balancing moves tasks by considering only each run-queue's load (sum of weights), so a task may not receive time in proportion to its weight.
  • Example) On a dual-core CPU with task1, task2, and task3 of equal weight, CPU1's run-queue holds task1 and CPU2's run-queue holds task2 and task3. Load balancing rarely takes place in this state, so the tasks are not guaranteed equal run-time even though they have equal weight.
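The example can be put into numbers with a short sketch. Both functions are hypothetical helpers: `cpu_share()` assumes each run-queue divides its CPU among its tasks in proportion to weight, and `fair_share()` is one possible reading of "run-time proportional to weight" across the whole system, capped at one full CPU per task.

```c
#include <assert.h>
#include <stddef.h>

/* Share of one CPU a task actually receives:
 * its weight divided by the total weight on its own run-queue. */
double cpu_share(unsigned long weight, unsigned long rq_total_weight)
{
    assert(weight > 0 && rq_total_weight >= weight);
    return (double)weight / (double)rq_total_weight;
}

/* Globally fair share (assumed definition): the task's fraction of the total
 * weight in the system times the number of CPUs, capped at one full CPU. */
double fair_share(unsigned long weight, unsigned long system_total_weight,
                  size_t nr_cpus)
{
    assert(weight > 0 && system_total_weight >= weight);
    double share = (double)nr_cpus * (double)weight / (double)system_total_weight;
    return share > 1.0 ? 1.0 : share;
}
```

With three tasks of weight 1024 on two CPUs, task1 alone on CPU1 gets `cpu_share(1024, 1024)`, a full CPU, while task2 and task3 on CPU2 each get `cpu_share(1024, 2048)`, half a CPU. Under the assumed definition, `fair_share(1024, 3072, 2)` gives every task about two thirds of a CPU, which per-run-queue balancing cannot deliver here.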
End
Q & A?