Processes
2
Process Concept
• What is a process?
– An instance of a program in execution.
– An encapsulation of the flow of control in a program.
– A dynamic and active entity.
– The basic unit of execution and scheduling.
– A process is named using its process ID (PID).
– Also called a job, task, or sequential process.
3
Process Creation (1)
• Process Hierarchy
– One process can create another process: a parent-child relationship
– UNIX calls the hierarchy a "process group"
– Windows has no concept of process hierarchy.
– (cf.) ps in UNIX, taskmgr (Task Manager) in Windows
(figure: the shell sh forks two children, cat and a.out, connected by a pipe)
$ cat file1 | a.out
4
Process Creation (2)
• Process creation events
– Calling a system call
• fork() in POSIX, CreateProcess() in Win32
• Shells or GUIs use this system call internally.
– System initialization
• init process
• Background processes
– Do not interact with users
– Daemons
5
Process Creation (3)
• Resource sharing
– Children may inherit all or a part of the parent's resources and privileges
• UNIX: user ID, open files, etc.
• Execution
– Parent may either wait for the child to finish, or continue in parallel.
• Address space
– Child duplicates the parent's address space, or has a new program loaded into it.
6
Process Termination
• Process termination events
– Normal exit (voluntary)
– Error exit (voluntary)
– Fatal error (involuntary)
• Exceeded allocated resources
• Segmentation fault
• Protection fault, etc.
– Killed by another process (involuntary)
• By receiving a signal
7
fork()
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    int pid;

    if ((pid = fork()) == 0)    /* child */
        printf("Child of %d is %d\n", getppid(), getpid());
    else                        /* parent */
        printf("I am %d. My child is %d\n", getpid(), pid);
    return 0;
}
% ./a.out
I am 31098. My child is 31099.
Child of 31098 is 31099.

% ./a.out
Child of 31100 is 31101.
I am 31100. My child is 31101.
8
Simplified UNIX Shell

int main()
{
    while (1) {
        char *cmd = read_command();
        int pid;

        if ((pid = fork()) == 0) {
            /* Manipulate stdin/stdout/stderr for
               pipes and redirections, etc. */
            exec(cmd);
            panic("exec failed!");
        } else {
            wait(pid);
        }
    }
}
9
Cooperating Processes
• Advantages
– Information sharing
– Computation speed-up
– Modularity
– Convenience
– (cf.) independent processes
• Issues
– Inter-process communication (IPC)
– Synchronization
– Deadlocks, etc.
10
Process State Transition (1)
11
Process State Transition (2)
• Linux example
– Process states (as reported by ps):
• R: Runnable
• S: Sleeping
• T: Traced or stopped
• D: Uninterruptible sleep
• Z: Zombie
– Additional flags:
• W: No resident pages
• <: High-priority task
• N: Low-priority task
• L: Has pages locked into memory
12
Process Data Structures
• PCB (Process Control Block)
– Each PCB represents a process.
– Contains all of the information about a process
• Process state
• Program counter
• CPU registers
• CPU scheduling information
• Memory management information
• Accounting information
• I/O status information, etc.
– task_struct in Linux
• 1456 bytes as of Linux 2.4.18
13
Process Control Block (PCB)
14
PCBs and Hardware State
• When a process is running:
– Its hardware state is inside the CPU: PC, SP, registers
• When the OS stops running a process:
– It saves the registers' values in the PCB.
• When the OS puts the process in the running state:
– It loads the hardware registers from the values in that process' PCB.
15
Context Switch (1)
16
Context Switch (2)
• Context Switch (or Process Switch)
– The act of switching the CPU from one process to another.
– Administrative overhead
• saving and loading registers and memory maps
• flushing and reloading the memory cache
• updating various tables and lists, etc.
– Context switch overhead is dependent on hardware support.
• Multiple register sets in UltraSPARC.
• Advanced memory management techniques may require extra data to be switched with each context.
– 100s or 1000s of switches/s typically.
17
Context Switch (3)
• Linux example
– Total 237,961,696 ticks = 661 hours = 27.5 days
– Total 142,817,428 context switches
– Roughly 60 context switches / sec
– (/proc/stat fields: user, nice, system, idle)
18
Process State Queues (1)
• State Queues
– The OS maintains a collection of queues that represent the state of all processes in the system
• Job queue
• Ready queue
• Wait queue(s): there may be many wait queues, one for each type of wait (device, timer, message, …)
– Each PCB is queued onto a state queue according to its current state.
– As a process changes state, its PCB is migrated between the various queues.
19
Process State Queues (2)
20
Process State Queues (3)
• PCBs and State Queues
– PCBs are data structures
• dynamically allocated inside OS memory
– When a process is created:
• OS allocates a PCB for it
• OS initializes the PCB
• OS puts the PCB on the correct queue
– As a process computes:
• OS moves its PCB from queue to queue
– When a process is terminated:
• OS deallocates its PCB
21
Process Creation: NT

BOOL CreateProcess (char *prog, char *args, …)

• CreateProcess()
– Creates and initializes a new PCB
– Creates and initializes a new address space
– Loads the program specified by "prog" into the address space
– Copies "args" into memory allocated in the address space
– Initializes the hardware context to start execution at main
– Places the PCB on the ready queue.
22
Process Creation: UNIX (1)

int fork()

• fork()
– Creates and initializes a new PCB
– Creates and initializes a new address space
– Initializes the address space with a copy of the entire contents of the address space of the parent.
– Initializes the kernel resources to point to the resources used by the parent (e.g., open files)
– Places the PCB on the ready queue.
– Returns the child's PID to the parent, and zero to the child.
23
Process Creation: UNIX (2)

int exec (char *prog, char *argv[])

• exec()
– Stops the current process
– Loads the program "prog" into the process' address space.
– Initializes hardware context and args for the new program.
– Places the PCB on the ready queue.
• Note: exec() does not create a new process.
– What does it mean for exec() to return?
24
Long-term Scheduler
• Job scheduler
– Selects which processes should be brought into the ready queue.
– Controls the degree of multiprogramming
– Should select a good mix of I/O-bound and CPU-bound processes
– Time-sharing systems such as UNIX often have no long-term scheduler
• Simply put every new process in memory.
• They depend either on a physical limitation or on the self-adjusting nature of human users.
25
Short-term Scheduler
• CPU scheduler
– Selects which process should be executed next and allocates the CPU.
– Should be fast!
– Scheduling criteria:
• CPU utilization
• Throughput
• Turnaround time
• Waiting time
• Response time
26
Mid-term Scheduler
• Swapper
– Removes processes from memory temporarily.
– Reduces the degree of multiprogramming.
– Can improve the process mix dynamically.
– Swapping was originally proposed to reduce memory pressure.
27
Why fork()?
• Very useful when the child…
– is cooperating with the parent.
– relies upon the parent's data to accomplish its task.
– Example: Web server

while (1) {
    int sock = accept();
    int pid;

    if ((pid = fork()) == 0) {
        /* Handle client request */
    } else {
        /* Close socket */
    }
}
28
Why not fork()?
• fork() is a heavy operation.
• The child often does not need the parent's context.
29
Take a Break !
Threads
31
Processes
• Heavy-weight
– A process includes many things:
• An address space (all the code and data pages)
• OS resources (e.g., open files) and accounting info
• Hardware execution state (PC, SP, registers, etc.)
– Creating a new process is costly because all of the data structures must be allocated and initialized
• Linux: over 100 fields in task_struct (excluding page tables, etc.)
– Inter-process communication is costly, since it must usually go through the OS
• Overhead of system calls and copying data
32
Cooperating Processes (Revisited)
• Example
– A web server, which forks off copies of itself to handle multiple simultaneous tasks
– Any parallel program on a multiprocessor
• We need to:
– Create several processes that execute in parallel
– Cause each to map the same address space to share data (e.g., shared memory)
– Have the OS schedule these processes in parallel (logically or physically)
• This is very inefficient!
– Space: PCB, page tables, etc.
– Time: creating OS structures, forking and copying the address space, etc.
33
Rethinking Processes
• What's similar in these cooperating processes?
– They all share the same code and data (address space)
– They all share the same privilege
– They all share the same resources (files, sockets, etc.)
• What's different?
– Each has its own hardware execution state: PC, registers, SP, and stack.
34
Key Idea (1)
• Separate the concept of a process from its execution state
– Process: address space, resources, other general process attributes (e.g., privileges)
– Execution state: PC, SP, registers, etc.
– This execution state is usually called
• a thread of control,
• a thread, or
• a lightweight process (LWP)
35
Key Idea (2)
36
Key Idea (3)
37
What is a Thread?
• A thread of control (or a thread)
– A sequence of instructions being executed in a program.
– Usually consists of
• a program counter (PC)
• a stack to keep track of local variables and return addresses
• registers
– Threads share the process instructions and most of its data.
• A change in shared data by one thread can be seen by the other threads in the process
– Threads also share most of the OS state of a process.
38
Processes vs. Threads
• Processes vs. Threads
– A thread is bound to a single process.
– A process, however, can have multiple threads.
– Sharing data between threads is cheap: all see the same address space.
– Threads become the unit of scheduling.
– Processes are now containers in which threads execute.
– Processes become static, threads are the dynamic entities.
39
Threads Design Space
(figure: design space of address spaces vs. threads)
– one thread per process, one process: MS/DOS
– one thread per process, many processes: older UNIXes
– many threads per process, one process: Java
– many threads per process, many processes: Mach, NT, Chorus, Linux, …
40
Multithreading
• Benefits
– Creating concurrency is cheap.
– Improves program structure.
– Throughput
• By overlapping computation with I/O operations
– Responsiveness (user interface / server)
• Can handle concurrent events (e.g., web servers)
– Resource sharing
– Economy
– Utilization of multiprocessor architectures
• Allows building parallel programs.
41
Process Address Space
(figure: one address space, from 0x00000000 at the bottom to 0xFFFFFFFF at the top)
• code (text segment): the PC points here
• static data (data segment)
• heap (dynamically allocated memory), growing upward
• stack (dynamically allocated memory), growing downward; the SP points here
42
Address Space with Threads
(figure: the same address space, but with one stack per thread)
• code (text segment): each thread has its own PC (T1, T2, T3)
• static data (data segment)
• heap (dynamically allocated memory)
• thread 1 stack (SP of T1)
• thread 2 stack (SP of T2)
• thread 3 stack (SP of T3)
43
Concurrent Servers: Processes
• Web server example
– Using fork() to create new processes to handle requests in parallel is overkill for such a simple task.

while (1) {
    int sock = accept();
    int pid;

    if ((pid = fork()) == 0) {
        /* Handle client request */
    } else {
        /* Close socket */
    }
}
44
Concurrent Servers: Threads
• Using threads
– We can create a new thread for each request.

webserver ()
{
    while (1) {
        int sock = accept();
        thread_fork (handle_request, sock);
    }
}

handle_request (int sock)
{
    /* Process request */
    close (sock);
}
45
Threads Interface (1)
• Pthreads
– A POSIX standard (IEEE 1003.1c) API for thread creation / synchronization.
– The API specifies the behavior of the thread library.
– The implementation is up to the developer of the library.
– Common in UNIX operating systems.
46
Threads Interface (2)
• POSIX-style threads
– Pthreads
– DCE threads (early version of Pthreads)
– Unix International (UI) threads (Solaris threads)
• Sun Solaris 2, SCO UnixWare 2
• Microsoft-style threads
– Win32 threads
• Microsoft Windows 98/NT/2000/XP
– OS/2 threads
• IBM OS/2
47
Pthreads (1)
• Thread creation/termination

int pthread_create (pthread_t *tid, pthread_attr_t *attr,
                    void *(*start_routine)(void *), void *arg);

void pthread_exit (void *retval);

int pthread_join (pthread_t tid, void **thread_return);
48
Pthreads (2)
• Mutexes

int pthread_mutex_init (pthread_mutex_t *mutex,
                        const pthread_mutexattr_t *mattr);

int pthread_mutex_destroy (pthread_mutex_t *mutex);

int pthread_mutex_lock (pthread_mutex_t *mutex);

int pthread_mutex_unlock (pthread_mutex_t *mutex);
49
Pthreads (3)
• Condition variables

int pthread_cond_init (pthread_cond_t *cond,
                       const pthread_condattr_t *cattr);

int pthread_cond_destroy (pthread_cond_t *cond);

int pthread_cond_wait (pthread_cond_t *cond,
                       pthread_mutex_t *mutex);

int pthread_cond_signal (pthread_cond_t *cond);

int pthread_cond_broadcast (pthread_cond_t *cond);
50
Threading Issues (1)
• fork() and exec()
– When a thread calls fork(),
• Does the new process duplicate all the threads?
• Is the new process single-threaded?
– Some UNIX systems support two versions of fork().
• In Pthreads,
– fork() duplicates only the calling thread.
• In the Unix International standard,
– fork() duplicates all parent threads in the child.
– fork1() duplicates only the calling thread.
– Normally, exec() replaces the entire process.
51
Threading Issues (2)
• Thread cancellation
– The task of terminating a thread before it has completed.
– Asynchronous cancellation
• Terminates the target thread immediately.
• What happens if the target thread is holding a resource, or is in the middle of updating shared data?
– Deferred cancellation
• The target thread is terminated at cancellation points. (see note)
– The Pthreads API supports both asynchronous and deferred cancellation.
52
Threading Issues (3)
• Signal handling
– Where should a signal be delivered?
– To the thread to which the signal applies.
• for synchronous signals (e.g., divide by zero)
– To every thread in the process (e.g., control-C).
– To certain threads in the process.
• typically only to a single thread found in the process that is not blocking the signal
• Pthreads
– per-process pending signals, per-thread blocked signal mask
– Assign a specific thread to receive all signals for the process.
• Solaris 2
53
Threading Issues (4)
• Thread pools
– Create a number of threads at process startup and place them into a pool.
– When a server receives a request, it awakens a thread from this pool.
– Once the thread completes its service, it returns to the pool awaiting more work.
– Benefits
• Faster to service a request with an existing thread than to wait for a new thread to be created.
• Limits the number of threads that exist at any one point.
54
Threading Issues (5)
• Thread-specific data (aka TLS: thread-local storage)
– Allows data to be associated with each thread
– applies to static and global variables
• other variables live on the stack
– APIs
• int pthread_key_create (pthread_key_t *key, void (*destroy_fn)(void *));
• TlsAlloc(): Windows
• ThreadLocal class: Java
• __declspec(thread) int a; : Visual C++
• C#, Python, …, etc.
55
Threading Issues (6)
• Using libraries
– errno
• Each thread should have its own independent version of the errno variable.
– Multithread-safe (MT-safe)
• A set of functions is said to be multithread-safe or reentrant when the functions may be called by more than one thread at a time without requiring any other action on the caller's part.
• Pure functions that access no global data, or access only read-only global data, are trivially MT-safe.
• Functions that modify global state must be made MT-safe by synchronizing access to the shared data.
56
57
Kernel/User-Level Threads
• Who is responsible for creating/managing threads?
– The OS (kernel threads)
• thread creation and management requires system calls
– The user-level process (user-level threads)
• A library linked into the program manages the threads
• Why is user-level thread management possible?
– Threads share the same address space
• The thread manager doesn't need to manipulate address spaces
– Threads only differ in hardware contexts (roughly)
• PC, SP, registers
• These can be manipulated by the user-level process itself.
58
Kernel-Level Threads (1)
• OS-managed threads
– The OS manages threads and processes.
– All thread operations are implemented in the kernel.
– The OS schedules all of the threads in a system.
• If one thread in a process blocks (e.g., on I/O), the OS knows about it, and can run other threads from that process.
• Possible to overlap I/O and computation inside a process.
– Kernel threads are cheaper than processes.
• Less state to allocate and initialize
– Examples:
• Windows 98/NT/2000/XP
• Solaris: lightweight processes (LWPs)
• Tru64 UNIX
• Linux
59
Kernel-level Threads (2)
• Limitations
– They can still be too expensive.
• For fine-grained concurrency, we need even cheaper threads.
• Ideally, we want thread operations as fast as a procedure call.
– Thread operations are all system calls.
• The program must cross an extra protection boundary on every thread operation, even when the processor is being switched between threads in the same address space.
• The OS must perform all of the usual argument checks.
– Must maintain kernel state for each thread.
• Can place a limit on the number of simultaneous threads (typically ~1000).
– Kernel-level threads have to be general to support the needs of all programmers, languages, runtimes, etc.
60
Implementing Kernel-level Threads
• Kernel-level threads
– Kernel-level threads are similar to the original process management and implementation.
61
User-level Threads (1)
• Motivation
– To make threads cheap and fast, they need to be implemented at the user level.
– Portable: user-level threads are managed entirely by the runtime system (user-level library).
• User-level threads are small and fast
– Each thread is represented simply by a PC, registers, a stack, and a small thread control block (TCB).
– Creating a thread, switching between threads, and synchronizing threads are done via procedure calls (no kernel involvement).
– User-level thread operations can be 10-100x faster than kernel-level threads.
62
User-level Threads (2)
• Limitations
– But user-level threads aren't perfect.
– User-level threads are invisible to the OS.
• They are not well integrated with the OS
– As a result, the OS can make poor decisions.
• Scheduling a process with only idle threads
• Blocking a process whose thread initiated I/O, even though the process has other threads that are ready to run
• Unscheduling a process with a thread holding a lock
– Solving this requires coordination between the kernel and the user-level thread manager.
• e.g., all blocking system calls should be emulated in the library via non-blocking calls to the kernel.
63
Implementing User-level Threads (1)
• User-level threads
64
Implementing User-level Threads (2)
• Thread context switch
– Very simple for user-level threads
• Save the context of the currently running thread: push all machine state onto its stack
• Restore the context of the next thread: pop machine state from the next thread's stack
• The next thread becomes the current thread
• Return to the caller as the new thread: execution resumes at the PC of the next thread
– All done in assembly language
• It works at the level of the procedure calling convention, so it cannot be implemented using procedure calls.
65
Implementing User-level Threads (3)
• Thread scheduling
– A thread scheduler determines when a thread runs.
• Just like the OS and processes
• But implemented at user level as a library
– It uses queues to keep track of what threads are doing.
• Run queue: threads currently running
• Ready queue: threads ready to run
• Wait queue: threads blocked for some reason (maybe blocked on I/O or a lock)
– How can we prevent a thread from hogging the CPU?
66
Implementing User-level Threads (4)
• Non-preemptive scheduling
– Force everybody to cooperate
• Threads willingly give up the CPU by calling yield().
– yield() calls into the scheduler, which context switches to another ready thread.
– What happens if a thread never calls yield()?

Thread ping ()
{
    while (1) {
        printf ("ping\n");
        yield();
    }
}

Thread pong ()
{
    while (1) {
        printf ("pong\n");
        yield();
    }
}
67
Implementing User-level Threads (5)
• Preemptive scheduling
– Need to regain control of the processor asynchronously.
– The scheduler requests that a timer interrupt be delivered by the OS periodically.
• Usually delivered as a UNIX signal
• Signals are just like software interrupts, but delivered to user level by the OS instead of delivered to the OS by hardware
– At each timer interrupt, the scheduler gains control and context switches as appropriate.
68
Linux Threads (1)
• LinuxThreads implementation
– http://pauillac.inria.fr/~xleroy/linuxthreads/
– In Linux, the basic unit is a "task".
• In a program that only calls fork() and/or exec(), a task is identical to a process.
• A task uses the clone() system call to implement multithreading.
– One-to-one model
• Linux creates a task for each application thread using clone().
– Resources can be shared selectively in clone():
• CLONE_PARENT : parent process
• CLONE_FS : FS root, current working dir., umask, …
• CLONE_FILES : file descriptor table
• CLONE_SIGHAND : signal handler table
• CLONE_VM, etc. : address space
69
Linux Threads (2)
• POSIX compatibility problems
– Basic difference in multithreading model
• POSIX: a single process that contains one or more threads.
• Linux: separate tasks that may share one or more resources.
– Resources
• POSIX: The following resources are specific to a thread, while all other resources are global to a process.
– CPU registers
– User stack
– Blocked signal mask
• Linux: The following resources may be shared between tasks via clone(), while all other resources are local to each task.
– Address space
– Signal handlers
– Open files
– Working directory
• PID, PPID, UID, GID, pending signal mask, … ???
70
Linux Threads (3)
• POSIX compatibility problems (cont'd)
– Signals
• POSIX: all signals sent to a process are collected into a process-wide set of pending signals, then delivered to any thread that is not blocking that signal.
• Linux: Linux only supports signals that are sent to a specific task. If that task has blocked that particular signal, it may remain pending indefinitely.
• Approaches for POSIX compliance
– Linux 2.4 introduces a concept of "thread groups".
– NPTL (Native POSIX Threading Library), by Red Hat
• 1:1 model
– NGPT (Next Generation POSIX Threading), by IBM
• M:N model
Scheduler Activations
Thomas E. Anderson, Brian N. Bershad, Edward D. Lazowska, and Henry M. Levy.
Parallelism Vehicles
• Process
– source of overhead: address space
• Kernel threads
– kernel supports multiple threads per address space
– no problem integrating with the kernel
– too heavyweight for parallel programs
– 10 × (user thread) < performance < 10 × (process)
• User-level threads
– fast, but
– the kernel knows nothing about threads
• User-level thread package
– managed by runtime library routines linked into each application program
– requires no kernel intervention
– efficient
• cost < 10 × (cost of a procedure call)
– flexible: can be customized to the needs of a language or user
– views a process as a virtual processor
• these virtual processors are being multiplexed across real processors
• may result in poor performance or incorrect behavior (e.g., deadlock caused by the absence of progress)
User-level thread packages built on kernel-level threads
• Base approach: allocate the same number of kernel threads as the number of processors allocated to a job
– when a user thread blocks, it wastes a processor
– if the package allocates more kernel threads than allocated processors, time slicing is needed
• problems with busy-wait synchronization
• scheduling of idle user threads
• Summary
– the number of kernel threads a job needs changes
– the number of processors allocated changes
– by the scheduling between jobs
– by the degree of parallelism
Scheduler Activation
• Goal
– performance at the level of user threads
– tight integration with the kernel
• Scheduler activations allow
– a user-level thread package that schedules parallel threads
– kernel-level threads that integrate well with the system
– two-way interaction between the thread package and the kernel
• What is a scheduler activation (s_a)?
– a kernel-level thread that also runs in user space
• needs two stacks
Program Start
(figure: five CPUs; the kernel and the user-level thread package interact as follows)
1. The kernel creates a scheduler activation (s_a).
2. The kernel assigns the s_a to a CPU.
3. The s_a upcalls into the user-level thread package (running on its kernel stack).
4. The s_a executes scheduler code inside the package (now on its user stack), which initializes the s_a with a thread to run.
Thread Creation/Deletion
• When a job needs more processors,
– it asks the kernel for more processors (syscall)
– the kernel allocates m CPUs (m <= n)
– the kernel creates m s_a's
– each s_a upcalls using the CPU allocated
– then, the user-level scheduler starts to schedule its threads
• When there is an idle processor,
– release it to the kernel
Upcall Points
79
Thread Blocking (Time T2)
• The thread manager in user space already has the thread's stack and TCB
• At blocking time, the registers and PC values are saved by the kernel
• The kernel notifies the thread manager that the thread is blocked, passing the saved register values
– using a new scheduler activation
– using the CPU that was being used by the blocked thread
• The thread manager saves the register values in the TCB and marks it as "blocked"
Processor Reallocation from A to B
1. The kernel sends an interrupt to a CPU (CPU-a) that was used by A
– stops the running thread (say Ti)
2. The kernel upcalls to B using CPU-a with a new scheduler activation
– notifies B that a new CPU is allocated
– the thread manager of B schedules a thread on CPU-a
3. The kernel takes another CPU (CPU-b) from A
– suppose Tj was running on that CPU
4. The kernel upcalls to A, notifying it that two threads, Ti and Tj, have been preempted
5. The thread manager of A schedules a thread on CPU-b
System Calls
• User-level programs notify the kernel of events such as
– when more processors are needed
– when processors are idle
• System calls (errata: Table III)
Critical Sections
• What if a thread holding a lock is preempted (or blocked)?
– this problem is not intrinsic to scheduler activations
– waste of CPU by waiting threads
– deadlock
• e.g., a thread holding a lock on the ready queue is preempted, and a scheduler activation upcalls to access the ready queue
• Solution
– on an upcall, the routine checks whether the preempted thread was running in a critical section
– if so, it schedules the preempted thread on another CPU by preempting a thread that is not in a critical section
– the thread releases the CPU as soon as it exits the critical section
– this scheme necessitates two kinds of critical section
• normal version
• preempted/resumed version
84
Implementation
• Modified the Topaz kernel thread management system to implement scheduler activations.
• Modified FastThreads to handle the user-level thread system.
• Processor allocation policy similar to Zahorjan and McCann's dynamic policy.
– Processors are shared; guarantee that no processor idles if there is work to do.
– Priorities are respected.
– The allocator just needs to know which address spaces need more processors and which processors are idle.
• User-level applications are free to choose any thread scheduling policy they like.
• Discarded scheduler activations can be collected and returned to the kernel for reuse, avoiding the overhead of recreating them.
• Scheduler activations were integrated into the Firefly Topaz debugger.
85
Conclusion
• User-level threads divide up the processor without the kernel's knowledge.
– Fast and flexible, but degrade when I/O and other kernel activities get in the way.
• Kernel-level threads
– Slow and expensive to use.
• Managing threads at the user level is needed to achieve high performance.
– But kernel threads or processes do not support this well.
• Scheduler activations provide an interface between the kernel and the user-level thread package.
– The kernel is responsible for processor allocation and for notifying the user level of events that affect it.
– The user level is responsible for thread scheduling and notifies the kernel of events that affect processor allocation decisions.
86
Reality with Threads
• Good for
– Operating systems
– Parallel scientific applications
– GUIs
• Bad for
– most programmers
– even for experts (development is painful)
87
Why Threads are Hard
• synchronization
• deadlock
• hard to debug
• module design doesn't live well with them
88
Why Threads are Hard
• Performance issues
– locking yields low concurrency
• fine-grained locking is complex
– OS scheduling intervenes (Scheduler Activations!)
• Poor support
– many libraries are not thread-safe
– kernel calls are not multi-threaded
– dependencies on OSes
• Alternatives?
– You will see