1
A Dynamic Aspect-oriented System for OS Kernels
Yoshisato Yanagisawa, Kenichi Kourai, Shigeru Chiba, and Rei Ishikawa
Tokyo Institute of Technology
2
Let’s insert profiling code into a running kernel
For logging time stamps at arbitrary execution points
– Performance tuning
– e.g. the elapsed time from a packet's arrival until it is stored in a kernel buffer.
Other approaches
– Existing kernel profilers?
  Time stamps are logged only at predefined points.
– Modifying the kernel source and rebooting?
  It’s annoying and error-prone.
3
Profiling code
Linux kernel code (fs/attr.c):
  int inode_change_ok(…)
  {
      …
      if ((ia_valid & ATTR_UID) && … attr->ia_uid != inode->i_uid) …
          goto error;
      if ((ia_valid & ATTR_GID) && …
  }
Profiling code to insert:
  struct timeval tv;
  do_gettimeofday(&tv);
  print_tv(tv);
  printk("%ld", inode->i_uid);
Record a time stamp at a specified source line, with values of variables.
No recompile or reboot!
4
Kerninst [Tamches et al. ’99]
An on-line kernel instrumentation tool.
– Assembly-level abstraction.
– Developers must manually calculate the addresses of:
  machine instructions,
  and variables, if their values are also logged.
Sample code for Kerninst to get a log:
  kmgr.findModule("kernel", &kmod);
  kmod.findFunction("inode_change_ok", &ifunc);
  ifunc.findEntryPoint(&entries);
  kmgr.findModule("profiler", &kmod);
  kmod.findFunction("print_log", &pf);
  hook = kapi_call_expr(pf.getEntryAddr(), args);
  kmgr.insertSnippet(hook, entries[0]);

  void print_log() {
      …
      __asm__ ("movl %%ebp, %0" : "=r"(ebp));
      uid = ((struct inode*)ebp[11])->i_uid; /* ebp[11] is inode */
      …
5
KLASY
Kernel-level Aspect-Oriented System
– Source-level abstraction.
  Thanks to Aspect-Oriented Programming (AOP)
– Our new implementation scheme allows users to:
  Specify arbitrary execution points at the source level
  – Pointcuts
  Write profiling code in C
  – Advice
  – The code is executed at the specified points.
  – It can access variables available at the execution points.
6
Logging
A killer application of AOP
– Logging (or profiling) code should be separated into an independent module.
Why a new AOP system?
– Existing dynamic AOP systems for C
  no context exposure: cannot access variables
  no pointcut of struct-member accesses
  – We cannot log a time stamp when inode->i_uid is updated. This is crucial.
7
An example of a KLASY aspect
Aspect (a pointcut followed by an advice body):
  <aspect>
    <import>linux/time.h</import>
    <advice>
      <pointcut>
        access(inode.i_uid) AND within_function(inode_change_ok) AND target(inode_value)
      </pointcut>
      <before>
        struct inode *i = inode_value;
        struct timeval tv;
        do_gettimeofday(&tv);
        print(i->i_uid, tv.tv_sec, tv.tv_usec);
      </before>
    </advice>
  </aspect>
Linux kernel code (fs/attr.c), in which the access to inode->i_uid is selected:
  int inode_change_ok(…)
  {
      …
      if ((ia_valid & ATTR_UID) && … attr->ia_uid != inode->i_uid) …
          goto error;
      if ((ia_valid & ATTR_GID) && …
  }
– <import> imports a header file used in advice bodies.
– The access(inode.i_uid) pointcut selects accesses to inode->i_uid.
– within_function() limits the selection to accesses inside inode_change_ok().
– Context exposure: target(inode_value) sets inode_value to the target structure, inode.
– The advice body gets a time stamp and stores it with the value of i_uid.
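To make the effect of this aspect concrete, here is a minimal user-space sketch of what the woven result amounts to. Everything in it is an assumption for illustration only: the cut-down struct inode, the log buffer, and the names before_i_uid_access and inode_change_ok_sim are hypothetical, and the hook is written as an ordinary call, whereas KLASY inserts it into the running kernel binary.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical user-space stand-in for the kernel's struct inode. */
struct inode {
    unsigned int i_uid;
};

/* One log record: the exposed value plus the time stamp. */
struct log_entry {
    unsigned int uid;
    long sec;
    long usec;
};

#define LOG_CAPACITY 16
struct log_entry log_buf[LOG_CAPACITY];
size_t log_len = 0;

/* Plays the role of the <before> advice body; inode_value is the
 * structure exposed by target(inode_value). */
void before_i_uid_access(struct inode *inode_value)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    if (log_len < LOG_CAPACITY) {
        log_buf[log_len].uid = inode_value->i_uid;
        log_buf[log_len].sec = (long)tv.tv_sec;
        log_buf[log_len].usec = (long)tv.tv_usec;
        log_len++;
    }
}

/* Simplified stand-in for inode_change_ok(). The explicit call marks where
 * a hook would run; it is hand-written here only to show the ordering. */
int inode_change_ok_sim(struct inode *inode, unsigned int new_uid)
{
    before_i_uid_access(inode);     /* hook: runs before the access */
    if (new_uid != inode->i_uid)    /* the selected struct-member access */
        return -1;
    return 0;
}
```

Each call to inode_change_ok_sim() appends one record, so the log shows when, and with which i_uid, the selected access was reached.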
14
Implementation of KLASY
Source-based binary-level dynamic weaving
– A modified GNU C compiler (gcc) produces richer symbol information, which enables:
  – Pointcut of accesses to struct-members
  – Context exposure (the addresses of variables)
– Kerninst as a backend to enable dynamic weaving.
– Advice is written in the C language, surrounded by XML-like tags.
15
An overview of KLASY
[Figure: overview of KLASY. Components shown: aspect, aspect compiler, pointcut, compiled advice, insmod, OS source code, modified gcc, richer symbol information, dynamic weaver, hook, core OS kernel, OS kernel.]
17
Our modified gcc compiler
It collects the following symbol information:
– The line number and the file name in which a struct-member is accessed.
  The parser of the compiler has been extended.
– The address of the first instruction of each line.
  The debug option (-g) is used at compile time.
  This is also necessary to enable pointcut of struct-member accesses.
19
Dynamic Weaver
Uses Kerninst to insert a hook at an execution point selected by a pointcut.
– Hook
  Code for calling an advice body
– The address at which a hook is inserted is found through the symbol information:
  An access to a struct-member
  The line number and the file name of that access
  The address of the first instruction of that line.
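The lookup described on this slide can be sketched as a join over two tables. This is only an illustration under assumed data layouts: the record types member_access and line_addr and the function resolve_hook_addrs are hypothetical names, not KLASY's actual symbol format or API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: where a struct-member is accessed in the source. */
struct member_access {
    const char *file;
    int line;
    const char *struct_name;
    const char *member;
};

/* Hypothetical record: address of the first instruction of a source line. */
struct line_addr {
    const char *file;
    int line;
    unsigned long addr;
};

/* Resolves an access(struct_name.member) pointcut to hook addresses:
 * member access -> (file, line) -> first instruction of that line.
 * Writes at most max addresses to out and returns how many were written. */
size_t resolve_hook_addrs(const struct member_access *acc, size_t n_acc,
                          const struct line_addr *lines, size_t n_lines,
                          const char *struct_name, const char *member,
                          unsigned long *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < n_acc && found < max; i++) {
        if (strcmp(acc[i].struct_name, struct_name) != 0 ||
            strcmp(acc[i].member, member) != 0)
            continue;
        for (size_t j = 0; j < n_lines; j++) {
            if (lines[j].line == acc[i].line &&
                strcmp(lines[j].file, acc[i].file) == 0) {
                out[found++] = lines[j].addr;
                break;
            }
        }
    }
    return found;
}
```

Because the hook lands on the first instruction of the line, this scheme works at source-line granularity, which matches the two kinds of symbol information the modified gcc collects.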
20
Unweaving
KLASY can remove woven aspects from the OS kernel at runtime
– This feature is important for profiling.
  Users typically need to try various profiling aspects.
  Users should be able to remove unnecessary aspects to avoid probe effects.
– Users can associate a user-friendly name with an aspect.
– KLASY also uses Kerninst to remove hooks inserted by our weaver.
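Weaving and unweaving by name can be pictured with a small registry. The sketch below is a user-space simulation under assumptions: the registry, weave, unweave, run_hooks, and count_hit are hypothetical names, and calling through a function pointer stands in for the hooks that Kerninst inserts into and removes from kernel code.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ASPECTS 8

typedef void (*advice_fn)(void *ctx);

/* Hypothetical registry entry: a user-friendly name bound to an advice. */
struct woven_aspect {
    const char *name;
    advice_fn advice;
    int active;
};

static struct woven_aspect registry[MAX_ASPECTS];

static struct woven_aspect *find_aspect(const char *name)
{
    for (size_t i = 0; i < MAX_ASPECTS; i++)
        if (registry[i].active && strcmp(registry[i].name, name) == 0)
            return &registry[i];
    return NULL;
}

/* Registers an aspect under a name; fails if the name is in use or
 * the registry is full. */
int weave(const char *name, advice_fn advice)
{
    if (find_aspect(name) != NULL)
        return -1;
    for (size_t i = 0; i < MAX_ASPECTS; i++) {
        if (!registry[i].active) {
            registry[i].name = name;
            registry[i].advice = advice;
            registry[i].active = 1;
            return 0;
        }
    }
    return -1;
}

/* Removes the aspect with the given name; fails if it is not woven. */
int unweave(const char *name)
{
    struct woven_aspect *a = find_aspect(name);

    if (a == NULL)
        return -1;
    a->active = 0;
    return 0;
}

/* Stands in for reaching an instrumented execution point. */
void run_hooks(void *ctx)
{
    for (size_t i = 0; i < MAX_ASPECTS; i++)
        if (registry[i].active)
            registry[i].advice(ctx);
}

/* Example advice: counts how often the execution point was reached. */
void count_hit(void *ctx)
{
    ++*(int *)ctx;
}
```

After unweave() returns, run_hooks() no longer calls the advice, which is the property that lets a user drop an unneeded profiling aspect and avoid its probe effect.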
21
Some supported Pointcuts and Advices
Pointcuts
– access
  Selects an access to a struct-member.
– execution
  Selects execution of a function.
Advices
– before
  Profiling code is executed before the selected point.
– after
  Profiling code is executed after the selected point.
22
Context exposure in the aspect
(Same aspect and Linux kernel code, fs/attr.c, as in the example above.)
Read a local variable and use it in an advice body.
23
Context Exposure
Pointcuts for context exposure
– target
  Gets a reference to the target structure of the struct-member access selected by the access pointcut.
– local_var
  Gets a reference to a local variable at the member access.
– argument
  Gets a reference to an argument of the function selected by the execution pointcut.
Implementation in the weaver
– Requires special information from gcc’s debug option
– Strong coupling between the weaver and KLASY’s gcc
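One way to picture context exposure is as a lookup from a variable name to its location in the current stack frame, with the location taken from debug information. The sketch below simulates that idea in user space and is an assumption, not KLASY's actual mechanism: var_info, read_exposed_var, and the byte array standing in for a stack frame are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical debug-info record: where a variable lives in a frame. */
struct var_info {
    const char *name;
    size_t offset;   /* byte offset from the start of the simulated frame */
    size_t size;     /* size of the variable in bytes */
};

/* Copies the named variable out of a simulated stack frame.
 * Returns 0 on success, -1 if the name is unknown or the sizes do not fit. */
int read_exposed_var(const unsigned char *frame, size_t frame_size,
                     const struct var_info *vars, size_t n_vars,
                     const char *name, void *out, size_t out_size)
{
    for (size_t i = 0; i < n_vars; i++) {
        if (strcmp(vars[i].name, name) != 0)
            continue;
        if (vars[i].size > out_size ||
            vars[i].offset > frame_size ||
            vars[i].size > frame_size - vars[i].offset)
            return -1;
        memcpy(out, frame + vars[i].offset, vars[i].size);
        return 0;
    }
    return -1;
}
```

A lookup of this kind needs a stable frame base to measure offsets from, which may be why a kernel built for context exposure keeps its frame pointer.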
24
Case study: tracing packet-handling
Goal
– To find a performance bottleneck of the Linux network I/O subsystem under heavy workload
We traced the accesses to struct sk_buff
– sk_buff is used as a buffer in the network subsystem of Linux
  Tracing member accesses to sk_buff shows the behavior of network processing
– We needed context exposure provided by KLASY
  to identify each network packet
  to ignore uninteresting packets
25
Aspect for tracing
  <aspect>
    <advice>
      <pointcut>
        access(sk_buff.%) AND target(arg0)
      </pointcut>
      <before>
        struct sk_buff *skb = arg0;
        unsigned long timestamp;
        if (skb->protocol != ETH_P_ARP) {
            STORE_DATA($pc$);
            STORE_DATA(skb);
            DO_RDTSC(timestamp);
            STORE_DATA(timestamp);
        }
      </before>
    </advice>
  </aspect>
– access(sk_buff.%): % is a wild-card, so accesses to any member of sk_buff are selected.
– skb->protocol != ETH_P_ARP: ignore ARP packets.
– Store a program counter, the address of each sk_buff structure, and a time stamp.
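The advice body above can be mimicked in user space to show what ends up in the trace. This is a sketch under assumptions: the one-field struct sk_buff, the fixed-size buffer, and the names trace_skb_access, trace_count, and trace_at are hypothetical, the program counter and time stamp are passed in by the caller instead of coming from $pc$ and DO_RDTSC, and the protocol field is kept in host byte order for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_P_ARP 0x0806        /* EtherType for ARP */
#define TRACE_CAPACITY 1024

/* Hypothetical stand-in for the kernel's struct sk_buff. */
struct sk_buff {
    uint16_t protocol;
};

static uint64_t trace_buf[TRACE_CAPACITY];
static size_t trace_len;

/* Mirrors the advice body: skip ARP packets, otherwise store one record
 * of three values (program counter, sk_buff address, time stamp). */
void trace_skb_access(const struct sk_buff *skb, uint64_t pc,
                      uint64_t timestamp)
{
    if (skb->protocol == ETH_P_ARP)
        return;
    if (trace_len + 3 > TRACE_CAPACITY)
        return;                 /* buffer full: drop the whole record */
    trace_buf[trace_len++] = pc;
    trace_buf[trace_len++] = (uint64_t)(uintptr_t)skb;
    trace_buf[trace_len++] = timestamp;
}

/* Number of values stored so far (three per record). */
size_t trace_count(void)
{
    return trace_len;
}

/* Value at index i, or 0 if i is out of range. */
uint64_t trace_at(size_t i)
{
    return i < trace_len ? trace_buf[i] : 0;
}
```

Storing the sk_buff address in every record is what later lets records that belong to the same packet be grouped together.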
26
Results of tracing
The results show that the performance bottleneck is process scheduling
– skb_copy_datagram_iovec is executed during a system call issued by a process
[Figure: time scale of packet arrival. Elapsed time on a logarithmic axis (0.1 to 1000) for e1000_clean_rx_irq, netif_receive_skb, ip_rcv, ip_rcv_finish, tcp_v4_rcv, tcp_v4_do_rcv, tcp_rcv_established, skb_copy_datagram_iovec, and __kfree_skb, with the annotation "Too much difference".]
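A timeline like this one can be derived from trace records by grouping them per packet. The sketch below shows one possible post-processing step and is an assumption: trace_rec and packet_elapsed are hypothetical names, the records are assumed to be in chronological order, and an sk_buff address is assumed to identify one packet for the duration of the trace.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded trace record. */
struct trace_rec {
    uint64_t pc;         /* program counter of the traced access */
    uint64_t skb;        /* address of the sk_buff, used as packet id */
    uint64_t timestamp;  /* time stamp counter value */
};

/* Computes the ticks between the first and the last record of one packet.
 * Returns 0 on success, -1 if the packet never appears in the trace. */
int packet_elapsed(const struct trace_rec *recs, size_t n,
                   uint64_t skb, uint64_t *elapsed)
{
    int seen = 0;
    uint64_t first = 0, last = 0;

    for (size_t i = 0; i < n; i++) {
        if (recs[i].skb != skb)
            continue;
        if (!seen) {
            first = recs[i].timestamp;
            seen = 1;
        }
        last = recs[i].timestamp;
    }
    if (!seen)
        return -1;
    *elapsed = last - first;
    return 0;
}
```

Comparing the gaps between consecutive records of one packet, rather than only the total, is what would point at the stage where the time is spent.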
27
Experiment
UnixBench benchmark
– Compare two Linux kernels:
  Linux: compiled by normal gcc.
  – With -fomit-frame-pointer optimization.
  KLASY: compiled by our modified gcc.
  – Cannot use -fomit-frame-pointer optimization, in order to enable context exposure.
– Measures the overhead of disabling -fomit-frame-pointer optimization
– Environment
  Fedora Core 2 (Linux™ 2.6.10), Kerninst 2.1.1
  The GNU C Compiler 3.3.3, AMD Athlon™ XP 2200+
  1GB RAM, Intel PRO/1000 Ethernet card
28
Result of UnixBench
The performance of the kernel compiled by our compiler is acceptable.
– Overhead: 0 to 12 %
  Average: 4.4 %
[Figure: bar chart of UnixBench results (vertical axis 0 to 1000) for Linux and KLASY on dhry2reg, syscall, pipe, execl, and context.]
dhry2reg: Dhrystone
syscall: system call
pipe: pipe system call
execl: execl system call
context: context switch
29
Related Works (1)
Dynamic aspect-oriented systems for C
– TOSKANA [Engel ’05], DAC++ [Almajali ’05], TinyC2 [Zhang ’03], and Arachne [Douence ’05]
  Dynamic code instrumentation.
  Only a function call or an execution is a pointcut.
  – They don’t use symbol information.
– TOSKANA-VM [Engel ’05]
  Running an OS kernel on a virtual machine.
  Not able to profile a kernel on native hardware.
30
Related Works (2)
Static aspect-oriented systems for C
– Not able to modify a running kernel.
– Need a reboot to activate profiling code.
  e.g. AspectC [Coady ’01], AspectC++ [Spinczyk ’02]
Kernel Profilers
– LKST, DTrace [Cantrill ’04], SystemTAP [Prasad ’05], and LTT [Yaghmour ’00]
  Tools for producing log messages about events occurring in the kernel.
  The users can only select some of the pre-defined execution points.
31
Concluding Remarks
KLASY: kernel-level aspect-oriented system
– source-based binary-level dynamic weaving
  Pointcut of struct-member accesses
  Context exposure
– KLASY was useful for profiling a network I/O subsystem
  We found that a performance bottleneck was process scheduling
KLASY is distributed from http://www.csg.is.titech.ac.jp/~yanagisawa/KLASY/.