
A Dynamic Aspect-oriented System for OS Kernels

Yoshisato Yanagisawa, Kenichi Kourai, Shigeru Chiba, and Rei Ishikawa

Tokyo Institute of Technology

Let’s insert profiling code into a running kernel

For logging time stamps at arbitrary execution points
– Performance tuning
– e.g. the elapsed time from a packet’s arrival until it is stored in a kernel buffer.

Other approaches
– Existing kernel profilers?
  Time stamps are logged only at predefined points.
– Modifying the kernel source and rebooting?
  It’s annoying and error-prone.

Profiling code

Linux kernel code (fs/attr.c):

    int inode_change_ok(…)
    {
        …
        if ((ia_valid & ATTR_UID) && …
            attr->ia_uid != inode->i_uid) …
            goto error;

        if ((ia_valid & ATTR_GID) && …
    }

Profiling code to insert:

    struct timeval tv;
    do_gettimeofday(&tv);
    print_tv(tv);
    printk("%ld", inode->i_uid);

Record a time stamp at a specified source line, with the values of variables.

No recompile or reboot!
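The snippet above relies on kernel-only calls (do_gettimeofday, printk). As a rough user-space analogue, the sketch below records a time stamp together with one variable value. The names format_log_record and log_uid are hypothetical and are not part of KLASY or the Linux kernel; gettimeofday() stands in for do_gettimeofday().

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper: format "<sec>.<usec> uid=<uid>" into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 on error or truncation. */
int format_log_record(char *buf, size_t len, struct timeval tv, long uid)
{
    assert(buf != NULL);
    int n = snprintf(buf, len, "%ld.%06ld uid=%ld",
                     (long)tv.tv_sec, (long)tv.tv_usec, uid);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Hypothetical helper: take a time stamp now and print it with uid.
 * Returns 0 on success, -1 on failure. */
int log_uid(long uid)
{
    struct timeval tv;
    char line[64];

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    if (format_log_record(line, sizeof line, tv, uid) < 0)
        return -1;
    return puts(line) == EOF ? -1 : 0;
}
```

Calling log_uid(inode_uid) at the point of interest prints one record per call. In the kernel, the equivalent code has to be inserted into a running image, which is the problem the rest of the talk addresses.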

Kerninst [Tamches et al. ’99]

An on-line kernel instrumentation tool.
– Assembly-level abstraction.
– Developers must manually calculate the addresses of:
  machine instructions,
  and variables, if their values are also logged.

Sample code for Kerninst to get a log:

    kmgr.findModule("kernel", &kmod);
    kmod.findFunction("inode_change_ok", &ifunc);
    ifunc.findEntryPoint(&entries);
    kmgr.findModule("profiler", &kmod);
    kmod.findFunction("print_log", &pf);
    hook = kapi_call_expr(pf.getEntryAddr(), args);
    kmgr.insertSnippet(hook, entries[0]);

    void print_log() {
        …
        __asm__ ("movl %%ebp, %0" : "=r"(ebp));
        uid = ((struct inode*)ebp[11])->i_uid; /* ebp[11] is inode */
        …
    }
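The fragility of hand-computed addresses can be seen even in user space. The sketch below uses a hypothetical struct demo_inode (not the kernel's struct inode) and reads a member once by name and once through a manually supplied byte offset, which is roughly the bookkeeping an assembly-level tool leaves to the developer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel structure; the real struct inode
 * has a different layout. */
struct demo_inode {
    unsigned long i_ino;
    unsigned int  i_uid;
    unsigned int  i_gid;
};

/* Source-level access: the compiler computes the member's address. */
unsigned int uid_by_name(const struct demo_inode *inode)
{
    assert(inode != NULL);
    return inode->i_uid;
}

/* Assembly-level style: the caller supplies a raw byte offset.
 * A wrong offset silently reads some other member. */
unsigned int read_uint_at(const void *base, size_t offset)
{
    unsigned int value;

    assert(base != NULL);
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

With offsetof(struct demo_inode, i_uid) as the offset, both functions return the same value. A hard-coded constant such as the ebp[11] above carries no such guarantee once the layout or the compiler options change.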

KLASY

Kernel-level Aspect-Oriented System
– Source-level abstraction.
  Thanks to Aspect-Oriented Programming (AOP)
– Our new implementation scheme allows users to:
  Specify arbitrary execution points at the source level
  – Pointcuts
  Write profiling code in C
  – Advice
  – The code is executed at the specified points.
  – It can access variables available at the execution points.

Logging

A killer application of AOP
– Logging (or profiling) code should be separated into an independent module.

Why a new AOP system?
– Existing dynamic AOP systems for C have:
  no context exposure: advice cannot access variables
  no pointcut for struct-member accesses
  – We cannot log a time stamp when inode->i_uid is updated. This is crucial.

An example of a KLASY aspect

Aspect:

    <aspect>
      <import>linux/time.h</import>
      <advice>
        <pointcut>
          access(inode.i_uid)
          AND within_function(inode_change_ok)
          AND target(inode_value)
        </pointcut>
        <before>
          struct inode *i = inode_value;
          struct timeval tv;
          do_gettimeofday(&amp;tv);
          print(i-&gt;i_uid, tv.tv_sec, tv.tv_usec);
        </before>
      </advice>
    </aspect>

Linux kernel code (fs/attr.c):

    int inode_change_ok(…)
    {
        …
        if ((ia_valid & ATTR_UID) && …
            attr->ia_uid != inode->i_uid) …    /* inode->i_uid: selected */
            goto error;

        if ((ia_valid & ATTR_GID) && …
    }

– <import>: imports a header file used in advice bodies.
– The access(inode.i_uid) pointcut selects inode->i_uid.
– within_function() limits the selection to within inode_change_ok().
– Context exposure: target(inode_value) sets inode_value to the target structure inode.
– The <before> advice gets a time stamp and stores it with the value of i_uid.
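One way to picture how such a pointcut selects an execution point is as a predicate over join points. The sketch below is a simplified model, not KLASY's implementation: struct join_point, struct pointcut, and pointcut_matches are hypothetical names, and only access(...) combined with an optional within_function(...) is modeled. Treating % as "any member" is an assumption based on the access(sk_buff.%) pointcut used in the tracing case study.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of one struct-member access in the source. */
struct join_point {
    const char *struct_name;   /* e.g. "inode" */
    const char *member;        /* e.g. "i_uid" */
    const char *function;      /* enclosing function */
};

/* Hypothetical model of access(S.M) AND within_function(F). */
struct pointcut {
    const char *struct_name;
    const char *member;        /* "%" matches any member */
    const char *function;      /* NULL means "any function" */
};

static bool name_matches(const char *pattern, const char *name)
{
    return strcmp(pattern, "%") == 0 || strcmp(pattern, name) == 0;
}

/* Returns true when the join point is selected by the pointcut. */
bool pointcut_matches(const struct pointcut *pc, const struct join_point *jp)
{
    assert(pc != NULL && jp != NULL);
    if (strcmp(pc->struct_name, jp->struct_name) != 0)
        return false;
    if (!name_matches(pc->member, jp->member))
        return false;
    return pc->function == NULL || strcmp(pc->function, jp->function) == 0;
}
```

A pointcut { "inode", "i_uid", "inode_change_ok" } then matches the access in inode_change_ok() and rejects the same member accessed from any other function.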

Implementation of KLASY

Source-based binary-level dynamic weaving
– A modified GNU C compiler (gcc) produces richer symbol information, which enables:
  – Pointcut of accesses to struct-members
  – Context exposure (the addresses of variables)
– Kerninst as a backend to enable dynamic weaving.
– Advice is written in the C language, surrounded by XML-like tags.

An overview of KLASY

[Figure: overview of KLASY. Labeled components: Aspect, OS source code, Aspect compiler, Modified gcc, pointcut, Compiled advice, Richer symbol information, Dynamic Weaver, insmod, Core, Hook, and OS kernel.]


Our modified gcc compiler

It collects the following symbol information:
– The line number and the file name in which a struct-member is accessed.
  The parser of the compiler has been extended.
– The address of the first instruction of each line.
  The debug option (-g) is used at compile time.
  This is also necessary to enable pointcut of struct-member accesses.
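To make the two kinds of symbol information concrete, the sketch below models them as two small tables and resolves a struct-member access to an instruction address. It is a hypothetical illustration: the record layouts and the function resolve_access are invented for this example and do not reflect the format that the modified gcc actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: where a struct-member is accessed (from the parser). */
struct access_info {
    const char *struct_name;
    const char *member;
    const char *file;
    int line;
};

/* Hypothetical record: first instruction of a source line (from -g). */
struct line_info {
    const char *file;
    int line;
    unsigned long addr;
};

/* Returns the address of the first instruction of the line holding the
 * first recorded access to struct_name.member that has line information,
 * or 0 if there is none. */
unsigned long resolve_access(const struct access_info *acc, size_t n_acc,
                             const struct line_info *lines, size_t n_lines,
                             const char *struct_name, const char *member)
{
    assert(acc != NULL && lines != NULL);
    for (size_t i = 0; i < n_acc; i++) {
        if (strcmp(acc[i].struct_name, struct_name) != 0 ||
            strcmp(acc[i].member, member) != 0)
            continue;
        for (size_t j = 0; j < n_lines; j++) {
            if (lines[j].line == acc[i].line &&
                strcmp(lines[j].file, acc[i].file) == 0)
                return lines[j].addr;
        }
    }
    return 0;
}
```

Note that the result is the start of the source line, not the exact instruction that performs the access; this matches the granularity described on the slide.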


Dynamic Weaver

Uses Kerninst to insert a hook at an execution point selected by a pointcut.
– Hook
  Code for calling an advice body
– The address at which a hook is inserted is resolved through the symbol information:
  An access to a struct-member
  The line number and the file name
  The address of the first instruction of that line.

Unweaving

KLASY can remove woven aspects from the OS kernel at runtime
– This feature is important for profiling.
  Users typically need to try various profiling aspects.
  Users should be able to remove unnecessary aspects to avoid probe effects.
– Users can associate a user-friendly name with an aspect.
– KLASY also uses Kerninst to remove hooks inserted by our weaver.
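The weave and unweave cycle can be modeled in user space with a small table of named hooks. The sketch below is only an analogy under stated assumptions: KLASY patches kernel code through Kerninst, whereas here a hook is a function pointer stored in a fixed-size table, and weave, unweave, run_hooks, and count_advice are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HOOKS 8

typedef void (*advice_fn)(void *context);

struct hook {
    const char *name;     /* user-friendly aspect name */
    advice_fn advice;     /* NULL marks a free slot */
};

static struct hook hooks[MAX_HOOKS];

/* Install advice under a name. Returns 0 on success, -1 if the table is full. */
int weave(const char *name, advice_fn advice)
{
    assert(name != NULL && advice != NULL);
    for (size_t i = 0; i < MAX_HOOKS; i++) {
        if (hooks[i].advice == NULL) {
            hooks[i].name = name;
            hooks[i].advice = advice;
            return 0;
        }
    }
    return -1;
}

/* Remove the advice registered under name. Returns 0 if found, -1 otherwise. */
int unweave(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < MAX_HOOKS; i++) {
        if (hooks[i].advice != NULL && strcmp(hooks[i].name, name) == 0) {
            hooks[i].advice = NULL;
            hooks[i].name = NULL;
            return 0;
        }
    }
    return -1;
}

/* Called at the instrumented point: runs every woven advice body.
 * Returns the number of advice bodies executed. */
int run_hooks(void *context)
{
    int executed = 0;

    for (size_t i = 0; i < MAX_HOOKS; i++) {
        if (hooks[i].advice != NULL) {
            hooks[i].advice(context);
            executed++;
        }
    }
    return executed;
}

/* Sample advice: counts how often the instrumented point is reached. */
void count_advice(void *context)
{
    ++*(int *)context;
}
```

After unweave("uid-profiler"), run_hooks() no longer calls the advice, so the instrumented path costs only the table scan. This mirrors the motivation given above: unused profiling aspects should not keep perturbing the measurement.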

Some supported pointcuts and advice

Pointcuts
– access
  Selects an access to a struct-member.
– execution
  Selects the execution of a function.

Advice
– before
  Profiling code is executed before the selected point.
– after
  Profiling code is executed after the selected point.

Context exposure in the aspect

In the example aspect, target(inode_value) exposes the inode structure used in inode_change_ok() to the advice body:

    struct inode *i = inode_value;

Read a local variable and use it in an advice body.

Context Exposure

Pointcuts for context exposure
– target
  Gets a reference to the target structure of the struct-member access selected by the access pointcut.
– local_var
  Gets a reference to a local variable at the member access.
– argument
  Gets a reference to an argument of the function selected by the execution pointcut.

Implementation in the weaver
– Requires special information from gcc’s debug option
– Strong coupling between the weaver and KLASY’s gcc
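Context exposure amounts to turning a variable name into an address at the selected point. The sketch below illustrates the idea with a simulated stack frame: a hypothetical table maps each variable name to a byte offset from the frame base, loosely in the manner of debug information. The names var_location and read_exposed_long and the offsets are assumptions for illustration; the actual weaver works on real kernel stack frames.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical debug-info entry: a variable and its offset from the frame base. */
struct var_location {
    const char *name;
    size_t offset;
};

/* Looks up name in the table and copies the long stored at
 * frame_base + offset into *out. Returns 0 on success, -1 if unknown. */
int read_exposed_long(const unsigned char *frame_base,
                      const struct var_location *vars, size_t n_vars,
                      const char *name, long *out)
{
    assert(frame_base != NULL && vars != NULL && out != NULL);
    for (size_t i = 0; i < n_vars; i++) {
        if (strcmp(vars[i].name, name) == 0) {
            memcpy(out, frame_base + vars[i].offset, sizeof *out);
            return 0;
        }
    }
    return -1;
}
```

The lookup only works if a stable frame base exists at the selected point, which gives some intuition for why the evaluated KLASY kernel is built without the -fomit-frame-pointer optimization.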

Case study: tracing packet handling

Goal
– To find a performance bottleneck in the Linux network I/O subsystem under heavy workload

We traced the accesses to struct sk_buff
– sk_buff is used as a buffer in the network subsystem of Linux.
  Tracing member accesses to sk_buff shows the behavior of network processing.
– We needed the context exposure provided by KLASY
  to identify each network packet
  to ignore uninteresting packets

Aspect for tracing

    <aspect>
      <advice>
        <pointcut>
          access(sk_buff.%) AND target(arg0)
        </pointcut>
        <before>
          struct sk_buff *skb = arg0;
          unsigned long timestamp;
          if (skb-&gt;protocol != ETH_P_ARP) {
            STORE_DATA($pc$);
            STORE_DATA(skb);
            DO_RDTSC(timestamp);
            STORE_DATA(timestamp);
          }
        </before>
      </advice>
    </aspect>

– Wildcard: sk_buff.% matches any member of sk_buff.
– skb->protocol: ignore ARP packets.
– Store a program counter, the position of each sk_buff structure, and a time stamp.
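The slide does not show how STORE_DATA is defined. As a guess at its general shape, the sketch below appends machine words to a fixed-size trace buffer and applies the same ARP filter. Everything here is hypothetical: the buffer layout, the demo_skb structure (a stand-in for struct sk_buff, with the protocol held in host byte order), and the caller-supplied time stamp that replaces DO_RDTSC. The value 0x0806 is the EtherType assigned to ARP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETH_P_ARP 0x0806   /* EtherType of ARP */
#define TRACE_CAPACITY 1024

static uintptr_t trace_buf[TRACE_CAPACITY];
static size_t trace_len;

/* Hypothetical STORE_DATA: append one word, dropping it when the buffer is full. */
#define STORE_DATA(x)                                   \
    do {                                                \
        if (trace_len < TRACE_CAPACITY)                 \
            trace_buf[trace_len++] = (uintptr_t)(x);    \
    } while (0)

/* Hypothetical stand-in for struct sk_buff. */
struct demo_skb {
    unsigned short protocol;   /* host byte order in this sketch */
};

/* Body of the "before" advice: record (pc, skb, timestamp) for non-ARP
 * packets. Returns the number of words stored. */
size_t trace_access(uintptr_t pc, const struct demo_skb *skb, uintptr_t timestamp)
{
    size_t before = trace_len;

    assert(skb != NULL);
    if (skb->protocol != DEMO_ETH_P_ARP) {
        STORE_DATA(pc);
        STORE_DATA(skb);
        STORE_DATA(timestamp);
    }
    return trace_len - before;
}
```

Storing the sk_buff address alongside each time stamp is what lets the records of one packet be grouped afterwards, which is the "identify each network packet" requirement stated in the case study.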

Results of tracing

The results show that the performance bottleneck is process scheduling
– skb_copy_datagram_iovec is executed during a system call issued by a process

[Figure: time scale of packet arrival. Elapsed-time axis with ticks 0.1, 1, 10, 100, 1000. Functions shown: e1000_clean_rx_irq, netif_receive_skb, ip_rcv, ip_rcv_finish, tcp_v4_rcv, tcp_v4_do_rcv, tcp_rcv_established, skb_copy_datagram_iovec, and __kfree_skb. Callout: "Too much difference".]

Experiment

UnixBench benchmark
– Compares two Linux kernels:
  Linux: compiled by normal gcc.
  – With the -fomit-frame-pointer optimization.
  KLASY: compiled by our modified gcc.
  – Cannot use the -fomit-frame-pointer optimization, in order to enable context exposure.
– Measures the overhead of disabling the -fomit-frame-pointer optimization
– Environment
  Fedora Core 2 (Linux™ 2.6.10), Kerninst 2.1.1
  The GNU C Compiler 3.3.3, AMD Athlon™ XP 2200+
  1GB RAM, Intel PRO/1000 Ethernet card

Result of UnixBench

The performance of the kernel compiled by our compiler is acceptable.
– Overhead:
  0 to 12 %
  Average: 4.4 %

[Figure: bar chart comparing Linux and KLASY, vertical axis from 0 to 1000, for the benchmarks dhry2reg, syscall, pipe, execl, and context.]

dhry2reg: Dhrystone
syscall: system call
pipe: pipe system call
execl: execl system call
context: context switch

Related Work (1)

Dynamic aspect-oriented systems for C
– TOSKANA [Engel ’05], DAC++ [Almajali ’05], TinyC2 [Zhang ’03], and Arachne [Douence ’05]
  Dynamic code instrumentation.
  Only a function call or an execution is a pointcut.
  – They don’t use symbol information.
– TOSKANA-VM [Engel ’05]
  Running an OS kernel on a virtual machine.
  Not able to profile a kernel on native hardware.

Related Work (2)

Static aspect-oriented systems for C
– Not able to modify a running kernel.
– Need a reboot to activate profiling code.
  e.g. AspectC [Coady ’01], AspectC++ [Spinczyk ’02]

Kernel profilers
– LKST, DTrace [Cantrill ’04], SystemTAP [Prasad ’05], and LTT [Yaghmour ’00]
  Tools for producing log messages about events occurring in the kernel.
  Users can select only some of the predefined execution points.

Concluding Remarks

KLASY: kernel-level aspect-oriented system
– Source-based binary-level dynamic weaving
  Pointcut of struct-member accesses
  Context exposure
– KLASY was useful for profiling a network I/O subsystem
  We found that a performance bottleneck was process scheduling

KLASY is distributed from http://www.csg.is.titech.ac.jp/~yanagisawa/KLASY/.