ylug 110th kpatch code reading
DESCRIPTION
Kpatch is a small live-patching tool/feature for the Linux kernel that is still under development. This deck explains how the kpatch kernel module works (the original slides are in Japanese).
TRANSCRIPT
© Hitachi, Ltd. 2014. All rights reserved.
YLUG 110th Kernel Reading
Reading Kpatch
YLUG Kernel Reading Group materials
Hitachi, Ltd., Yokohama Research Laboratory, Linux Technology Research Project
2014/5/2
Masami Hiramatsu

Contents
1. Kernel live patching: kpatch
2. Reading the kpatch kernel module
3. Miscellaneous

1. Kernel live patching: kpatch
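
1-1. What is kpatch, the kernel live patcher?
Patch a running Linux kernel without rebooting
– Security fixes for non-stop systems
– Applying minor fixes
– A stopgap until the next maintenance window
Developed mainly by Red Hat developers
– A response to Oracle's ksplice and SuSE's kGraft?
– Developed in the open on GitHub from the early stages
– Easy to deploy because it is only a kernel module
  • Even so, integration into the kernel will be needed in the future
Hot patches are generated from a kernel diff
– Explained in detail on @masami256's blog
  • http://kernhack.hatenablog.com/entry/2014/03/08/144026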

1-2. Lineage of (kernel) live patching
Status of live-patching proposals (timeline, 2004 to 2014)
Application level:
– Pannus Live Patch (2004-2006) http://pannus.sourceforge.net/
  • Developed for CGL support; no distributor support
– Livepatch (2005) http://ukai.jp/Slides/2005/1202-b2con/mop/livepatch.html
  • Developed under the inspiration of the above; uses ptrace and mmap
Kernel level:
– ksplice (2009-) https://www.ksplice.com/
  • MIT students founded a startup around it; eventually acquired by Oracle
– kGraft (2014) and kpatch (2014)
  • Built by major distributors (Red Hat, SuSE); reuse existing kernel features
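
1-3. Challenges in live patching
How to apply a patch
– Replacement at the instruction-block level
  • The number of instructions changes, which makes this hard to implement
– Replacement at the function level
  • Replacement mechanisms such as ftrace and kprobes are available
Consistency handling (transactions)
– All patches contained in the binary diff must be applied at the same time
  • All old functions are switched to the new functions simultaneously
– The patch must be applied to all threads at the same time
  • Old and new functions never run at the same time
– If the above cannot be guaranteed, the patch must not be applied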
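The consistency rules above amount to an all-or-nothing transaction: check every function first, and only then switch all of them. The following is a minimal user-space sketch of that idea in C, not kpatch code. The names `struct func_slot`, `in_use`, `current_ver`, and `apply_all_or_nothing()` are hypothetical and exist only for illustration; the `in_use` flag stands in for whatever mechanism detects that a thread is still executing the old function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one function being patched. */
struct func_slot {
    int in_use;       /* nonzero if some thread is still inside the old function */
    int current_ver;  /* version callers currently reach: 0 = old, 1 = new */
};

/*
 * Switch every slot to the new version, or none of them.
 * Returns 0 on success, -1 if any old function is still in use.
 */
int apply_all_or_nothing(struct func_slot *slots, size_t n)
{
    size_t i;

    assert(slots != NULL || n == 0);

    /* Phase 1: verify. Nothing is modified if any check fails. */
    for (i = 0; i < n; i++)
        if (slots[i].in_use)
            return -1;

    /* Phase 2: commit. Reached only when every check passed. */
    for (i = 0; i < n; i++)
        slots[i].current_ver = 1;

    return 0;
}
```

If one slot in a set is busy, the call fails and every slot stays on the old version. In a real kernel the two phases must also run with no other CPU executing in between, which this single-threaded sketch does not model.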

1-4. Kpatch on GitHub
Kpatch is developed on GitHub
The functionality that patches the kernel lives in kmod/core

Reading the kpatch code (1)
Code size of kmod/core: tiny
$ wc -l core.c kpatch.h
 337 core.c
  43 kpatch.h
 380 total
Easy enough, then?
Header file kpatch.h
#include <linux/types.h>

struct kpatch_func {
    unsigned long new_addr;
    unsigned long old_addr;
    unsigned long old_size;
    struct module *mod;
    struct hlist_node node;
};

extern int kpatch_register(struct module *mod, struct kpatch_func *funcs,
                           int num_funcs);
extern int kpatch_unregister(struct module *mod, struct kpatch_func *funcs,
                             int num_funcs);
– struct kpatch_func appears to hold the information about a function to be patched
– There are only two APIs

Reading the kpatch code (2)
• kmod/core/core.c
The approach is described at the top of the file:
• Each patch module uses the kpatch module to add new functions
• stop_machine is used to confirm that the old functions are not in use before switching to the addresses of the new functions
• For each function to be rewritten, ftrace is used to rewrite the IP so that execution continues in the new function
/* Contains the code for the core kpatch module.  Each patch module registers
 * with this module to redirect old functions to new functions.
 *
 * Each patch module can contain one or more new functions.  This information
 * is contained in the .patches section of the patch module.  For each function
 * patched by the module we must:
 * - Call stop_machine
 * - Ensure that no execution thread is currently in the old function (or has
 *   it in the call stack)
 * - Add the new function address to the kpatch_funcs table
 *
 * After that, each call to the old function calls into kpatch_ftrace_handler()
 * which finds the new function in the kpatch_funcs table and updates the
 * return instruction pointer so that ftrace will return to the new function.
 */

Reading the kpatch code (3)
Reading the register API
int kpatch_register(struct module *mod, struct kpatch_func *funcs,
                    int num_funcs)
{
[...]
    down(&kpatch_mutex);

    for (i = 0; i < num_funcs; i++) {
        struct kpatch_func *f, *func = &funcs[i];
[...]
        func->mod = mod;
[...]
        /* Add an ftrace handler for this function. */
        ret = ftrace_set_filter_ip(&kpatch_ftrace_ops, func->old_addr, 0, 0);
[...]
    }

    /* Register the ftrace trampoline if it hasn't been done already. */
    if (!kpatch_num_registered++) {
        ret = register_ftrace_function(&kpatch_ftrace_ops);
[...]
    /*
     * Idle the CPUs, verify activeness safety, and atomically make the new
     * functions visible to the trampoline.
     */
    ret = stop_machine(kpatch_apply_patch, &args, NULL);
    if (ret) {
[...]
    up(&kpatch_mutex);
    return ret;
– ftrace_set_filter_ip: add the function's IP (instruction pointer) address to the ftrace filter
– register_ftrace_function: register the handler with ftrace only on the first registration
– stop_machine: call kpatch_apply_patch under stop_machine
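The `if (!kpatch_num_registered++)` test in the code above is a compact "do this only for the first user" idiom. The following user-space sketch shows the idiom in isolation. Everything in it (`patch_register()`, `patch_unregister()`, `backend_start()`, and the counters) is hypothetical, and the rollback on failure is a choice made for this sketch; the slide elides what the real code does at that point.

```c
#include <assert.h>

static int num_registered;   /* how many patch sets are registered */
static int backend_active;   /* 1 while the shared hook is installed */
static int backend_starts;   /* how many times the hook was installed */

/* Hypothetical stand-in for installing the shared hook; fails on request. */
static int backend_start(int should_fail)
{
    if (should_fail)
        return -1;
    backend_active = 1;
    backend_starts++;
    return 0;
}

int patch_register(int should_fail)
{
    /* Post-increment: the test is true only when the count was 0. */
    if (!num_registered++) {
        int ret = backend_start(should_fail);

        if (ret) {
            num_registered--;   /* undo the count on failure */
            return ret;
        }
    }
    return 0;
}

int patch_unregister(void)
{
    assert(num_registered > 0);

    /* Pre-decrement: the test is true only when the count reaches 0. */
    if (!--num_registered)
        backend_active = 0;
    return 0;
}
```

Registering twice installs the hook once, and it is removed only when the last user unregisters. The counter is not atomic here, so like the original it relies on a surrounding lock (the `kpatch_mutex` in the slide).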

Reading the kpatch code (4)
• What runs under stop_machine
struct kpatch_stop_machine_args {
    struct kpatch_func *funcs;
    int num_funcs;
};

/* Called from stop_machine */
static int kpatch_apply_patch(void *data)
{
    struct kpatch_stop_machine_args *args = data;
    struct kpatch_func *funcs = args->funcs;
    int num_funcs = args->num_funcs;
    int i, ret;

    ret = kpatch_verify_activeness_safety(funcs, num_funcs);
    if (ret)
        goto out;

    for (i = 0; i < num_funcs; i++) {
        struct kpatch_func *func = &funcs[i];

        /* update the global list and go live */
        hash_add(kpatch_func_hash, &func->node, func->old_addr);
    }

out:
    return ret;
}
– kpatch_verify_activeness_safety: confirm for every thread that the old functions are not in use; fail if any thread is using them
– hash_add: update the hash table; because everything is halted by stop_machine, this is safe except against NMIs

Reading the kpatch code (5)
struct stacktrace_ops kpatch_backtrace_ops = {
    .address    = kpatch_backtrace_address_verify,
    .stack      = kpatch_backtrace_stack,
    .walk_stack = print_context_stack_bp,
};

/*
 * Verify activeness safety, i.e. that none of the to-be-patched functions are
 * on the stack of any task.
 *
 * This function is called from stop_machine() context.
 */
static int kpatch_verify_activeness_safety(struct kpatch_func *funcs,
                                           int num_funcs)
{
    struct task_struct *g, *t;
    int ret = 0;

    struct kpatch_backtrace_args args = {
        .funcs = funcs,
        .num_funcs = num_funcs,
        .ret = 0
    };

    /* Check the stacks of all tasks. */
    do_each_thread(g, t) {
        dump_trace(t, NULL, NULL, 0, &kpatch_backtrace_ops, &args);
        if (args.ret) {
            ret = args.ret;
            goto out;
        }
    } while_each_thread(g, t);

out:
    return ret;
}
– do_each_thread: enumerate all threads (task structs)
– dump_trace: dump the backtrace, processed through kpatch_backtrace_ops
– The real work is done in kpatch_backtrace_address_verify

Reading the kpatch code (6)
#define KPATCH_HASH_BITS 8
DEFINE_HASHTABLE(kpatch_func_hash, KPATCH_HASH_BITS);

DEFINE_SEMAPHORE(kpatch_mutex);

static int kpatch_num_registered;

struct kpatch_backtrace_args {
    struct kpatch_func *funcs;
    int num_funcs, ret;
};

void kpatch_backtrace_address_verify(void *data, unsigned long address,
                                     int reliable)
{
    struct kpatch_backtrace_args *args = data;
    struct kpatch_func *funcs = args->funcs;
    int i, num_funcs = args->num_funcs;

    if (args->ret)
        return;

    for (i = 0; i < num_funcs; i++) {
        struct kpatch_func *func = &funcs[i];

        if (address >= func->old_addr &&
            address < func->old_addr + func->old_size) {
            printk("kpatch: activeness safety check failed for "
                   "function at address " "'%lx()'\n",
                   func->old_addr);
            args->ret = -EBUSY;
            return;
        }
    }
}
– kpatch_func_hash: hash table of the functions to be patched
– kpatch_backtrace_address_verify: backtrace callback used for the safety check
– It checks whether a backtrace address falls inside any function that is about to be rewritten
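The range test in the callback above can be tried outside the kernel. The sketch below is a simplified user-space illustration, not the kpatch implementation: `struct func_range`, `addr_in_func()`, and `stack_is_safe()` are hypothetical names, and the "stack trace" is just an array of addresses supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified version of the per-function record. */
struct func_range {
    unsigned long old_addr;  /* start address of the old function */
    unsigned long old_size;  /* size of the old function in bytes */
};

/* True if addr lies in the half-open range [old_addr, old_addr + old_size). */
static int addr_in_func(unsigned long addr, const struct func_range *f)
{
    return addr >= f->old_addr && addr < f->old_addr + f->old_size;
}

/*
 * Returns 1 if no address in the simulated stack trace falls inside any
 * of the given functions, 0 otherwise.
 */
int stack_is_safe(const unsigned long *trace, size_t depth,
                  const struct func_range *funcs, size_t num_funcs)
{
    size_t i, j;

    assert(trace != NULL || depth == 0);

    for (i = 0; i < depth; i++)
        for (j = 0; j < num_funcs; j++)
            if (addr_in_func(trace[i], &funcs[j]))
                return 0;
    return 1;
}
```

For a function at 0x1000 with size 0x40, the address 0x103f counts as inside, while 0x0fff and 0x1040 do not, because the end of the range is exclusive.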

Reading the kpatch code (7)
• Where ftrace is used to switch which function gets executed
void notrace kpatch_ftrace_handler(unsigned long ip, unsigned long parent_ip,
                                   struct ftrace_ops *op, struct pt_regs *regs)
{
    struct kpatch_func *f;

    /*
     * This is where the magic happens.  Update regs->ip to tell ftrace to
     * return to the new function.
     *
     * If there are multiple patch modules that have registered to patch
     * the same function, the last one to register wins, as it'll be first
     * in the hash bucket.
     */
    preempt_disable_notrace();
    hash_for_each_possible(kpatch_func_hash, f, node, ip) {
        if (f->old_addr == ip) {
            regs->ip = f->new_addr;
            break;
        }
    }
    preempt_enable_notrace();
}
– hash_for_each_possible: use the function address as the hash key to look up the new function
– regs->ip = f->new_addr: update the IP register
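The "last one to register wins" behaviour follows from inserting new entries at the head of a hash bucket. Here is a small user-space sketch of that property, under stated assumptions: `struct func_entry`, `func_add()`, `func_lookup()`, and the toy hash in `bucket_of()` are all hypothetical, and the kernel's hashtable uses its own hash function and list type.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 256   /* 2^8, mirroring an 8-bit hash */

struct func_entry {
    unsigned long old_addr;
    unsigned long new_addr;
    struct func_entry *next;   /* singly linked bucket chain */
};

static struct func_entry *buckets[NBUCKETS];

/* Hypothetical toy hash; the kernel's hashtable uses a different function. */
static size_t bucket_of(unsigned long addr)
{
    return (size_t)((addr >> 4) % NBUCKETS);
}

/* Insert at the head of the bucket so newer entries are found first. */
void func_add(struct func_entry *e)
{
    size_t b;

    assert(e != NULL);
    b = bucket_of(e->old_addr);
    e->next = buckets[b];
    buckets[b] = e;
}

/* Return the replacement address, or old_addr itself if not patched. */
unsigned long func_lookup(unsigned long old_addr)
{
    struct func_entry *e;

    for (e = buckets[bucket_of(old_addr)]; e != NULL; e = e->next)
        if (e->old_addr == old_addr)   /* a bucket may hold other keys */
            return e->new_addr;
    return old_addr;
}
```

The explicit `old_addr` comparison matters because different addresses can share a bucket, which is also why the handler in the slide compares `f->old_addr == ip` inside the loop.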

Reading the kpatch code, supplement (1): ftrace
static struct ftrace_ops kpatch_ftrace_ops __read_mostly = {
    .func = kpatch_ftrace_handler,
    .flags = FTRACE_OPS_FL_SAVE_REGS,
};

(include/linux/ftrace.h)
# define FTRACE_REGS_ADDR ((unsigned long)ftrace_regs_caller)

(kernel/trace/ftrace.c)
static int
__ftrace_replace_code(struct dyn_ftrace *rec, int enable)
...
    case FTRACE_UPDATE_MODIFY_CALL:
        if (rec->flags & FTRACE_FL_REGS)
            ftrace_old_addr = (unsigned long)FTRACE_ADDR;
        else
            ftrace_old_addr = (unsigned long)FTRACE_REGS_ADDR;

        return ftrace_modify_call(rec, ftrace_old_addr, ftrace_addr);
– The SAVE_REGS flag is set on the ftrace handler in advance
– When the SAVE_REGS flag is set, ftrace_regs_caller is selected
– ftrace_modify_call rewrites the destination of the call instruction that calls mcount (fentry)

Reading the kpatch code, supplement (2): ftrace_regs_caller
(arch/x86/kernel/entry_64.S)
ENTRY(ftrace_regs_caller)
    /* Save the current flags before compare (in SS location)*/
    pushfq
    ...

    /* Save the rest of pt_regs */
    movq %r15, R15(%rsp)
    movq %r14, R14(%rsp)
    ...
    /* regs go into 4th parameter */
    leaq (%rsp), %rcx

GLOBAL(ftrace_regs_call)
    call ftrace_stub

    /* Copy flags back to SS, to restore them */
    movq EFLAGS(%rsp), %rax
    movq %rax, SS(%rsp)

    /* Handlers can change the RIP */
    movq RIP(%rsp), %rax
    movq %rax, SS+8(%rsp)

    /* restore the rest of pt_regs */
    movq R15(%rsp), %r15
    movq R14(%rsp), %r14
    ...
– ftrace_regs_caller saves pt_regs and then calls the handler
– call ftrace_stub: at run time, the user-defined handler is what actually gets called here
– "Handlers can change the RIP": this is the trick; the return address on the stack is changed to the new IP
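The trick above (save state, call a handler, then return to whatever address the handler left behind) can be imitated in user space with function pointers. This is only an analogy, written as a sketch: the real mechanism rewrites a return address on the kernel stack in assembly, and every name below (`struct fake_regs`, `redirect_handler()`, `call_via_trampoline()`, `old_double()`, `new_double()`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* Hypothetical stand-in for the saved register set. */
struct fake_regs {
    func_t ip;   /* where execution continues after the hook returns */
};

typedef void (*handler_t)(func_t self, struct fake_regs *regs);

static int old_double(int x) { return x + x + 1; }   /* buggy original */
static int new_double(int x) { return x + x; }       /* fixed replacement */

/* Handler: if the hooked function is old_double, redirect to new_double. */
static void redirect_handler(func_t self, struct fake_regs *regs)
{
    if (self == old_double)
        regs->ip = new_double;
}

/*
 * Simulated trampoline: save state, call the handler, then continue at
 * whatever regs.ip holds. With no handler, the original function runs.
 */
int call_via_trampoline(func_t target, handler_t handler, int arg)
{
    struct fake_regs regs;

    assert(target != NULL);

    regs.ip = target;             /* "save" the original continuation */
    if (handler != NULL)
        handler(target, &regs);   /* the handler may rewrite regs.ip */
    return regs.ip(arg);          /* "return" to the possibly new address */
}
```

Without a handler the buggy `old_double` runs; with `redirect_handler` installed, the same call ends up in `new_double`, and the caller never needs to know.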
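
Summary
• Kpatch is very simple and safe
– Hook the functions to be patched with ftrace and redirect them to the new functions
– Old and new functions never run at the same time
  • stop_machine prevents parallel execution
  • Backtraces of all threads are used to check whether a target function is executing before the patch is applied
– No self-modifying code
  • Everything is delegated to ftrace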

Kpatch: what happened afterwards
Code size of kmod/core (as of 5/2): about twice as large?
$ wc -l core.c kpatch.h
 559 core.c
  60 kpatch.h
 619 total
From the Git log:
$ git log --log-size --since Apr.10 core.c
commit 2f34cf9a895c411f2f89f6cb5ed65272c6c28b04
log size 1937
Author: Josh Poimboeuf
Date:   Mon Apr 28 11:41:20 2014 -0500
    kmod/core: NMI synchronization improvements
…
commit 42e0779c0cb2bbf7f3fdc8860b7182f496a515f2
log size 2332
Author: Masami Hiramatsu
Date:   Wed Apr 23 10:58:45 2014 +0900
    kmod/core: Support live patching on NMI handlers
– 2f34cf9a: improvements to NMI handling
– 42e0779c: NMI support

Afterword 2
• Both kpatch and kGraft were posted to LKML
– kGraft: 16 patches, tools included
  ● http://thread.gmane.org/gmane.linux.kernel/1694304
– kpatch: 2 patches, no tools
  ● https://lkml.org/lkml/2014/5/1/273
  ● The tools are on GitHub.
  ● Naturally these are the latest patches, so they differ considerably from the code we read today.
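
Supplement: kGraft and kpatch
■ The core of how patches are generated is nearly the same
Put every function in its own section and compile the whole kernel → extract only the functions that differ and build them into a kernel module
■ How the patch is applied to the kernel differs slightly
Both use the ftrace interface and replace code function by function
・Kpatch is fairly conservative (uses stop_machine)
・kGraft is quite aggressive (no stop_machine; it decides by looking at user space)
How does that hold up in terms of transactional consistency...?
■ Which is better?
・The plan is to align views at LinuxCon Japan and similar venues
・Both are still under active development, so they may be merged in the future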
END
Trademarks
• Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
• Oracle and Java are registered trademarks of Oracle and/or its affiliates.
• Red Hat is a registered trademark of Red Hat, Inc. in the United States and other countries.
• Other company, product, or service names may be trademarks or service marks of others.