Docker 原理與實作 (Docker: Principles and Implementation)

Speaker: 果凍

Upload: ya790026

Post on 08-Sep-2014


DESCRIPTION

The technology behind Docker. This talk is for OSDC.tw 2014.

TRANSCRIPT

Page 1: Docker 原理與實作

Docker 原理與實作 (Docker: Principles and Implementation)

果凍

Page 2: Docker 原理與實作

Introduction

● Works at 迎廣科技
  ○ python
  ○ openstack

● http://about.me/ya790206
● http://blog.blackwhite.tw/
● https://github.com/ya790206/call_seq

Page 3: Docker 原理與實作

Agenda

● linux kernel namespace
● seccomp
● cgroup
● lxc
● docker

Page 4: Docker 原理與實作

docker

● Lightweight, portable, self-sufficient containers.

● Processes running in one container are isolated from processes running in other containers.

Page 5: Docker 原理與實作

Linux startup process

● Linux startup process
  ○ Boot loader ->
  ○ Kernel ->
  ○ Init process

● Differences between Linux distros:
  ○ package manager
  ○ init

Page 6: Docker 原理與實作

Docker

[Diagram: the kernel and userspace features Docker builds on]
● aufs
● lxc
● Kernel namespaces
● AppArmor and SELinux profiles
● Seccomp policies
● Control groups
● Kernel capabilities
● Chroots
● btrfs

Page 7: Docker 原理與實作

kernel namespace

● The purpose of each namespace is to wrap a particular global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource.

● Private view

Page 8: Docker 原理與實作

kernel pid namespace

root pid namespace
  pid 1 (pid 1)
  pid 2 (pid 2)
  pid namespace x
    pid 3 (pid 1)
    pid 4 (pid 2)

● black (the first number): the real pid.
● red (the number in parentheses): the pid the process gets from getpid().
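On Linux 4.1 and later, the two numbers in this diagram can be read from the `NSpid:` line of /proc/<pid>/status, which lists the pid of a process in each nested pid namespace, outermost first. The following is a minimal Python sketch under that assumption; `parse_nspid` is a hypothetical helper name, and the sample text stands in for a real status file.

```python
def parse_nspid(status_text):
    """Return the pids listed on the NSpid line of /proc/<pid>/status.

    The list runs from the outermost pid namespace visible to the reader
    down to the innermost namespace the process lives in.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # field absent, for example on kernels older than 4.1


# Sample standing in for /proc/3/status as read from the root namespace.
sample = "Name:\tbash\nPid:\t3\nNSpid:\t3\t1\n"
real_pid, pid_in_namespace_x = parse_nspid(sample)
print(real_pid, pid_in_namespace_x)
```

In the sample, the process is pid 3 in the root namespace and pid 1 inside pid namespace x, matching the "pid 3 (pid 1)" entry in the diagram.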

Page 9: Docker 原理與實作

kernel namespace

● Mount namespaces
● UTS namespaces
● PID namespaces
● Network namespaces
● User namespaces
● IPC namespaces
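Each process exposes the namespaces it belongs to as symbolic links under /proc/<pid>/ns (mnt, uts, pid, net, user, ipc). Two processes are in the same namespace when the link targets match. The sketch below is one possible way to inspect this from Python; `namespace_ids` and `same_namespace` are hypothetical helper names, and `proc_root` is a parameter added only so the code can be pointed at a test directory.

```python
import os


def namespace_ids(pid="self", proc_root="/proc"):
    """Map namespace name -> identifier string such as 'pid:[4026531836]'."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }


def same_namespace(kind, pid_a, pid_b, proc_root="/proc"):
    """True when both processes share the namespace of the given kind."""
    ids_a = namespace_ids(pid_a, proc_root)
    ids_b = namespace_ids(pid_b, proc_root)
    return ids_a[kind] == ids_b[kind]


if __name__ == "__main__" and os.path.isdir("/proc/self/ns"):
    # Print the namespaces of the current process.
    for name, ident in namespace_ids().items():
        print(name, ident)
```

Reading the links of another user's process usually requires root; reading /proc/self/ns does not.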

Page 10: Docker 原理與實作

int child_pid = clone(child_main, child_stack + STACK_SIZE,
                      CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWPID | SIGCHLD,
                      NULL);

● https://gist.github.com/ya790206/9855021
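The third argument of the clone() call above packs two things into one integer: the CLONE_NEW* bits select which namespaces the child gets fresh copies of, and the low byte names the signal sent to the parent when the child exits. The sketch below decodes such a flags word in Python. The constants are the values from <linux/sched.h>; `describe_clone_flags` is a hypothetical helper, not part of any library.

```python
import signal

# Namespace flag values from <linux/sched.h>.
NAMESPACE_FLAGS = {
    "CLONE_NEWNS": 0x00020000,    # mount points
    "CLONE_NEWUTS": 0x04000000,   # hostname and domain name
    "CLONE_NEWIPC": 0x08000000,   # System V IPC, POSIX message queues
    "CLONE_NEWUSER": 0x10000000,  # user and group ids
    "CLONE_NEWPID": 0x20000000,   # process ids
    "CLONE_NEWNET": 0x40000000,   # network stack
}


def describe_clone_flags(flags):
    """Split a clone() flags word into namespace names and the exit signal."""
    namespaces = [name for name, bit in NAMESPACE_FLAGS.items() if flags & bit]
    exit_signal = flags & 0xFF  # low byte: signal delivered to the parent
    return namespaces, exit_signal


flags = (NAMESPACE_FLAGS["CLONE_NEWUTS"]
         | NAMESPACE_FLAGS["CLONE_NEWIPC"]
         | NAMESPACE_FLAGS["CLONE_NEWPID"]
         | int(signal.SIGCHLD))
print(describe_clone_flags(flags))
```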

Page 11: Docker 原理與實作

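The tail is still showing

(The child has its own pid namespace, but /proc is still the host's, so tools such as ps still list the host's processes.)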

Page 12: Docker 原理與實作

int child_pid = clone(child_main, child_stack + STACK_SIZE,
                      CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWNS | SIGCHLD,
                      NULL);
mount("proc", "/proc", "proc", 0, NULL);

● https://gist.github.com/ya790206/9855094

Page 13: Docker 原理與實作

seccomp

● A process running in seccomp mode is severely limited in what it can do.

● It can make only four system calls: read(), write(), exit(), and sigreturn(). read() and write() work only on file descriptors that are already open.
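The restriction can be observed from Python by forking a child, switching it into strict mode with prctl(), and watching how it ends. This is only a sketch, assuming Linux and a kernel that accepts the request (a sandbox may refuse it). The constants come from <linux/prctl.h> and <linux/seccomp.h>; `run_in_strict_seccomp` is a hypothetical helper name.

```python
import ctypes
import os
import signal

PR_SET_SECCOMP = 22      # from <linux/prctl.h>
SECCOMP_MODE_STRICT = 1  # from <linux/seccomp.h>


def run_in_strict_seccomp():
    """Fork a child, put it in strict seccomp mode, report how it ended."""
    libc = ctypes.CDLL(None, use_errno=True)  # loaded before the fork
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            entered = libc.prctl(PR_SET_SECCOMP,
                                 ctypes.c_ulong(SECCOMP_MODE_STRICT),
                                 ctypes.c_ulong(0), ctypes.c_ulong(0),
                                 ctypes.c_ulong(0)) == 0
            if entered:
                # write() to an already-open descriptor is still allowed.
                os.write(write_end, b"write() still works\n")
                # Any other system call makes the kernel send SIGKILL.
                os.getppid()
        finally:
            os._exit(2)  # reached only if strict mode was not entered
    os.close(write_end)
    with os.fdopen(read_end, "rb") as pipe:
        output = pipe.read()
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL:
        return "killed by SIGKILL", output
    return "seccomp strict mode unavailable", output


if __name__ == "__main__":
    print(run_in_strict_seccomp())
```

The experiment runs in a forked child because strict mode cannot be left once entered, and the Python interpreter itself needs many more than four system calls to keep running.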

Page 14: Docker 原理與實作

libseccomp example

https://gist.github.com/ya790206/9579145

Page 15: Docker 原理與實作

cgroup

● This work was started by engineers at Google

● Resource limiting
● Prioritization
● Accounting
● Control

Page 16: Docker 原理與實作

cgroup

○ blkio: this subsystem sets limits on input/output access to and from block devices such as physical drives (disk, solid state, USB, etc.).
○ cpu: this subsystem uses the scheduler to provide cgroup tasks access to the CPU.
○ cpuacct: this subsystem generates automatic reports on CPU resources used by tasks in a cgroup.
○ cpuset: this subsystem assigns individual CPUs (on a multicore system) and memory nodes to tasks in a cgroup.
○ devices: this subsystem allows or denies access to devices by tasks in a cgroup.
○ freezer: this subsystem suspends or resumes tasks in a cgroup.
○ memory: this subsystem sets limits on memory use by tasks in a cgroup, and generates automatic reports on memory resources used by those tasks.
○ net_cls: this subsystem tags network packets with a class identifier (classid) that allows the Linux traffic controller (tc) to identify packets originating from a particular cgroup task.
○ net_prio: this subsystem provides a way to dynamically set the priority of network traffic per network interface.
○ ns: the namespace subsystem.
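Which cgroup a process belongs to in each of these subsystems is listed in /proc/<pid>/cgroup, one line per hierarchy in the form `hierarchy-ID:controller-list:cgroup-path`. A minimal Python sketch for reading that format follows; `parse_proc_cgroup` is a hypothetical helper name and the sample text is made up for illustration.

```python
def parse_proc_cgroup(text):
    """Map each controller (subsystem) name to the process's cgroup path."""
    membership = {}
    for line in text.splitlines():
        if not line:
            continue
        _hierarchy_id, controllers, path = line.split(":", 2)
        # Several controllers can be mounted on one hierarchy: "cpu,cpuacct".
        for controller in controllers.split(","):
            membership[controller] = path
    return membership


# Made-up sample standing in for the contents of /proc/self/cgroup.
sample = "4:freezer:/my\n3:cpu,cpuacct:/\n2:memory:/my\n"
print(parse_proc_cgroup(sample))
```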

Page 17: Docker 原理與實作

cgroup freezer

● The cgroup freezer is useful to batch job management systems, which start and stop sets of tasks in order to schedule the resources of a machine according to the desires of a system administrator.

Page 18: Docker 原理與實作

$ mount -t cgroup -ofreezer freezer /<path>/freezer
$ mkdir /<path>/freezer/my

/<path>/freezer: root cgroup
  tasks (all processes)
  otherfile
  my

/<path>/freezer/my: sub cgroup
  tasks (pid)
  otherfile

Page 19: Docker 原理與實作

cgroup freezer

$ mount -t cgroup -ofreezer freezer /<path>/freezer
$ cd /<path>/freezer/; ls
cgroup.clone_children cgroup.event_control cgroup.procs cgroup.sane_behavior notify_on_release release_agent tasks

1. mkdir my_group; cd my_group
2. echo $some_pid > tasks
3. echo FROZEN > freezer.state
4. echo THAWED > freezer.state
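Steps 2 to 4 above are plain file writes, so they can be scripted. The sketch below does the same from Python, assuming a cgroup v1 freezer hierarchy is already mounted and the group directory already exists. The function names are hypothetical, and `cgroup_dir` is whatever path the group was created under.

```python
import os


def set_state(cgroup_dir, state):
    """Write FROZEN or THAWED to the group's freezer.state file."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError("unknown freezer state: %r" % state)
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as state_file:
        state_file.write(state + "\n")


def freeze_group(cgroup_dir, pids):
    """Move the given pids into the group, then suspend all of its tasks."""
    for pid in pids:
        # One pid per write, like running `echo $some_pid > tasks` per pid.
        with open(os.path.join(cgroup_dir, "tasks"), "a") as tasks:
            tasks.write("%d\n" % pid)
    set_state(cgroup_dir, "FROZEN")


def thaw_group(cgroup_dir):
    """Resume every task in the group."""
    set_state(cgroup_dir, "THAWED")
```

On a real system this would be called as root, for example `freeze_group("/<path>/freezer/my_group", [some_pid])` followed later by `thaw_group(...)`.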

Page 20: Docker 原理與實作

other cgroup

● memory cgroup:
  ○ limit process memory usage.
  ○ show various statistics

● blkio cgroup:
  ○ change weight
  ○ show various statistics
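For the memory cgroup, both points map to files in the group directory under cgroup v1: the limit is written to memory.limit_in_bytes and the statistics are read from memory.stat, which holds one "name value" pair per line. A short Python sketch under those assumptions, with hypothetical helper names:

```python
import os


def limit_memory(cgroup_dir, limit_bytes):
    """Cap the memory usable by the tasks in this cgroup."""
    path = os.path.join(cgroup_dir, "memory.limit_in_bytes")
    with open(path, "w") as limit_file:
        limit_file.write("%d\n" % limit_bytes)


def read_memory_stat(cgroup_dir):
    """Return memory.stat as a dict of counter name -> integer value."""
    stats = {}
    with open(os.path.join(cgroup_dir, "memory.stat")) as stat_file:
        for line in stat_file:
            if line.strip():
                name, value = line.split()
                stats[name] = int(value)
    return stats
```

For example, `limit_memory(group_dir, 64 * 1024 * 1024)` would cap a group at 64 MiB, where `group_dir` is a directory created under a mounted memory hierarchy.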

Page 21: Docker 原理與實作

lxc

● LXC is a userspace interface for the Linux kernel containment features.

● Container templates
● A set of standard tools to control the containers

Page 22: Docker 原理與實作

lxc

host os
  container A
    process 1
    process 2
  container B
    process 3
    process 4
  process x

[Figure legend: one arrow style is labelled "A can see B."; the other is labelled "A can see B. B can see A."]

Page 23: Docker 原理與實作

lxc

1. lxc-create -n test-container -t ubuntu
2. lxc-ls --fancy
3. lxc-start -n test-container
4. lxc-console -n test-container
5. lxc-stop -n test-container
6. lxc-destroy -n test-container

Page 24: Docker 原理與實作

start vs execute

● start (lxc-start):
  ○ boot linux system

● execute (lxc-execute):
  ○ execute program directly
  ○ make sure you have "/usr/lib/lxc/lxc-init" in your container

Page 25: Docker 原理與實作

$ sudo lxc-checkpoint --name p1 --statefile a

● output:
  ○ lxc-checkpoint: 'checkpoint' function not implemented

Page 26: Docker 原理與實作

linux aufs

● It allows files and directories of separate filesystems to co-exist under a single directory.

[Diagram: /tmp/union is the union of /tmp/a, /tmp/b, and /tmp/c]

Page 27: Docker 原理與實作

# apt-get install aufs-tools

# mount -t aufs -o br=/tmp/a:/tmp/b none /tmp/union/

# mount -t aufs -o br=/tmp/a=rw:/tmp/b=rw none /tmp/union
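What the union mount presents can be modelled in a few lines: a name is looked up in the branches in the order given to `br=`, and the first branch that has it wins, while a directory listing is the merged set of names. The Python sketch below is a deliberately simplified model of that behaviour, not aufs itself; it ignores whiteouts and copy-up, and both function names are hypothetical.

```python
import os
import tempfile


def union_lookup(branches, relative_path):
    """Return the path in the first branch that contains relative_path."""
    for branch in branches:
        candidate = os.path.join(branch, relative_path)
        if os.path.lexists(candidate):
            return candidate
    return None


def union_listdir(branches):
    """Return the merged, de-duplicated top-level names of all branches."""
    names = set()
    for branch in branches:
        names.update(os.listdir(branch))
    return sorted(names)


# Demo with two temporary directories standing in for /tmp/a and /tmp/b.
branch_a = tempfile.mkdtemp()
branch_b = tempfile.mkdtemp()
for directory, name in ((branch_a, "x"), (branch_b, "x"), (branch_b, "y")):
    with open(os.path.join(directory, name), "w") as handle:
        handle.write("from " + directory)
print(union_listdir([branch_a, branch_b]))
print(union_lookup([branch_a, branch_b], "x"))
```

In the demo, both branches contain `x`, and the lookup resolves to the copy in the first branch.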

Page 28: Docker 原理與實作

docker vs lxc

● docker is based on lxc.
● docker can create an image from a text file (the Dockerfile).
● docker seldom boots a full system.
● docker provides a user-friendly interface.
● docker uses less disk space (aufs).

Page 29: Docker 原理與實作

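docker

[Diagram: container lifecycle]
● image (rootfs) --run--> running container (process + rootfs)
● running container --stop--> stopped container (rootfs)
● stopped container --start--> running container
● container --commit--> image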

Page 30: Docker 原理與實作

rootfs in container (layers stacked by aufs, top first)
  image: rw
  ZZZ image: ro
  XXX image: ro
  ubuntu image: ro

rootfs in image (layers stacked by aufs, top first)
  image: ro
  ZZZ image: ro
  XXX image: ro
  ubuntu image: ro

Page 31: Docker 原理與實作

taiwan.py site dockerfile

FROM ubuntu:12.10

RUN apt-get update

RUN apt-get install -y python-dev

RUN apt-get install -y python-pip

RUN apt-get install -y git

RUN pip install mynt

RUN git clone https://github.com/lucemia/taiwan.py

RUN mynt gen -f taiwan.py/src/ taiwan.py/build/

EXPOSE 8000

CMD cd taiwan.py/build/ && python -m SimpleHTTPServer

Page 32: Docker 原理與實作

How to run

1. cat dockerfile | sudo docker build -t taiwanpy -
2. docker run -p 9000:8000 taiwanpy
   (-p takes hostPort:containerPort; the server in the container listens on port 8000)
3. docker stop xxx
4. docker start xxx
5. docker stop xxx
6. docker rm xxx
7. docker rmi taiwanpy

Page 33: Docker 原理與實作

simple docker shell

● https://github.com/ya790206/misc_tools/tree/master/docker_wrapper

Page 34: Docker 原理與實作

Summary

● Namespaces for virtualization.
● Cgroups for controlling a group of processes.
● The container and the host system use the same kernel.
● Docker is similar to lxc, but Docker is easier to use.

Page 35: Docker 原理與實作

Question

Page 36: Docker 原理與實作

Thank you

Page 39: Docker 原理與實作

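References

● Linux Kernel Hacks: Techniques and Tools for Improving Performance, Development Efficiency, and Energy Saving (Chinese edition: Linux Kernel Hacks:改善效能、提昇開發效率及節能的技巧與工具)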