Advanced Information Technology - Virtualization (2) - Virtualization (V12N)

薛智文, cwhsueh@csie.ntu.edu.tw, http://www.csie.ntu.edu.tw/~cwhsueh/
100 Fall, Nov 4, Fri 678, DTH 104


Page 1: Title

Department of Computer Science and Information Engineering, National Taiwan University

薛智文, cwhsueh@csie.ntu.edu.tw
http://www.csie.ntu.edu.tw/~cwhsueh/
100 Fall, Nov 4, Fri 678, DTH 104

Advanced Information Technology - Virtualization (2) - Virtualization (V12N)

Page 2: Outline

NEWS Lab, Graduate Institute of Networking and Multimedia, Department of CSIE

Introduction
Xen
  Architecture
  Hypercall
  CPU Virtualization
  Memory Virtualization
  I/O Device Virtualization
  Hardware Virtual Machine
Benchmark
  Domain 1
Summary

Page 3: How to Virtualize?

Full Virtualization: Binary translation
Paravirtualization: Hypercall
Hardware Assisted Virtualization: Intel VT-x & AMD SVM, Trap and emulate

Page 4: Type I and Type II VMMs

Type I - Hypervisor, e.g. Xen (stack from top to bottom)
  VM0, VM1, …, VMN
  Hypervisor (Virtual Machine Monitor, VMM)
  Hardware

Type II - Hosted VMM, e.g. VMware (stack from top to bottom)
  VM0, VM1, …, VMN
  Hosted VMM
  Host Operating System
  Hardware

VM: Virtual Machine = Guest OS + Virtual Devices

Page 5: Hypervisor (VMM) Type

Type I + Microkernel
  Xen (open source, Citrix), Microsoft Hyper-V
Type I + Integrated kernel
  VMware ESX, KVM (Kernel-based Virtual Machine)
Type II (Host OS + Guest OS)
  VMware GSX, VMware Workstation, Microsoft Virtual PC, Microsoft Virtual Server, Sun VirtualBox

Page 6: Xen Architecture (1/2)

[Figure: Domain 0 alongside three Domain U guests]

Page 7: Xen Architecture (2/2)

Compared to common Linux

Linux                  Xen
System Calls           Hyper Calls
Signals                Events
Interrupts             Physical + Virtual Interrupts
CPU                    PCPU + VCPU
Filesystem             XenStore
POSIX Shared Memory    Grant Tables/Shared Pages

Page 8: Hyper Call

System Call (int 0x80)

// linux/include/asm/unistd.h
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
…

Hyper Call (int 0x82)

// xen/include/public/xen.h
#define __HYPERVISOR_set_trap_table 0
#define __HYPERVISOR_mmu_update 1
#define __HYPERVISOR_set_gdt 2
#define __HYPERVISOR_stack_switch 3
…

Hyper Call flow, e.g. HYPERVISOR_sched_op
  Guest OS issues the hypercall with int 82h
  Hypervisor looks up the handler in hypercall_table
  do_sched_op runs
  iret resumes the Guest OS
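
The table lookup in the flow above can be sketched in plain user-space C. This is only an illustration of table-driven dispatch, not Xen's implementation: the four numbers come from the slide, while the handler bodies, `NR_HYPERCALLS_SKETCH`, `ERR_NO_SUCH_CALL` and `dispatch_hypercall` are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypercall numbers as listed on the slide (xen/include/public/xen.h). */
#define __HYPERVISOR_set_trap_table 0
#define __HYPERVISOR_mmu_update     1
#define __HYPERVISOR_set_gdt        2
#define __HYPERVISOR_stack_switch   3

/* Hypothetical: the sketch knows only the four calls above. */
#define NR_HYPERCALLS_SKETCH 4
#define ERR_NO_SUCH_CALL     (-1)

typedef long (*hypercall_fn)(long arg);

/* Placeholder handlers: each returns a recognizable value, nothing more. */
static long do_set_trap_table(long arg) { return 100 + arg; }
static long do_mmu_update(long arg)     { return 200 + arg; }
static long do_set_gdt(long arg)        { return 300 + arg; }
static long do_stack_switch(long arg)   { return 400 + arg; }

/* The hypercall number is the index into the table of handlers. */
static const hypercall_fn hypercall_table[NR_HYPERCALLS_SKETCH] = {
    [__HYPERVISOR_set_trap_table] = do_set_trap_table,
    [__HYPERVISOR_mmu_update]     = do_mmu_update,
    [__HYPERVISOR_set_gdt]        = do_set_gdt,
    [__HYPERVISOR_stack_switch]   = do_stack_switch,
};

/* Stands in for the trap entry point: validate the number, then call. */
long dispatch_hypercall(unsigned int nr, long arg)
{
    if (nr >= NR_HYPERCALLS_SKETCH || hypercall_table[nr] == NULL)
        return ERR_NO_SUCH_CALL;
    return hypercall_table[nr](arg);
}
```

A system call table indexed by the `__NR_*` numbers would follow the same pattern, which is the parallel the slide draws.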

Page 9: Grant Table

Page mapping & Page transferring
Page as a unit
Grant reference (GR): Grant entry

Page mapping (Domain A shares a page with Domain B)
  Domain A: create GR
  Domain A: send GR
  Domain B: map page
  Domain B: access page
  Domain B: unmap page
  Domain B: inform
  Domain A: release GR

Page transferring (between Domain A and Domain B)
  Steps shown in the figure: create GR, send GR, transfer page, receive page, inform, release GR
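
The page-mapping sequence can be mimicked with a small table of grant entries. This is a user-space sketch under stated assumptions, not Xen's grant table interface: `struct grant_entry`, the `grant_*` functions and the table size are all hypothetical, and a "frame" is just a number here.

```c
#include <assert.h>
#include <stdbool.h>

#define GRANT_ENTRIES 8      /* hypothetical table size */
#define GRANT_INVALID (-1)

/* One grant entry: which domain may map which frame. */
struct grant_entry {
    bool          in_use;
    int           grantee;    /* domain allowed to map the page */
    unsigned long frame;      /* page being shared */
    int           map_count;  /* how many mappings are outstanding */
};

struct grant_table {
    struct grant_entry e[GRANT_ENTRIES];
};

static bool grant_valid(const struct grant_table *t, int gr)
{
    return gr >= 0 && gr < GRANT_ENTRIES && t->e[gr].in_use;
}

/* Domain A: create a GR. The returned index is what A sends to B. */
int grant_create(struct grant_table *t, int grantee, unsigned long frame)
{
    for (int i = 0; i < GRANT_ENTRIES; i++) {
        if (!t->e[i].in_use) {
            t->e[i].in_use = true;
            t->e[i].grantee = grantee;
            t->e[i].frame = frame;
            t->e[i].map_count = 0;
            return i;
        }
    }
    return GRANT_INVALID;
}

/* Domain B: map the page named by the GR; only the grantee succeeds. */
bool grant_map(struct grant_table *t, int gr, int domid, unsigned long *frame_out)
{
    if (!grant_valid(t, gr) || t->e[gr].grantee != domid)
        return false;
    t->e[gr].map_count++;
    *frame_out = t->e[gr].frame;
    return true;
}

/* Domain B: unmap the page before informing Domain A. */
bool grant_unmap(struct grant_table *t, int gr, int domid)
{
    if (!grant_valid(t, gr) || t->e[gr].grantee != domid || t->e[gr].map_count == 0)
        return false;
    t->e[gr].map_count--;
    return true;
}

/* Domain A: release the GR; refused while the page is still mapped. */
bool grant_release(struct grant_table *t, int gr)
{
    if (!grant_valid(t, gr) || t->e[gr].map_count > 0)
        return false;
    t->e[gr].in_use = false;
    return true;
}
```

Refusing the release while a mapping is outstanding is the reason the sequence has an explicit "inform" step before "release GR".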

Page 10: Event Channel

A lightweight signal mechanism
Uses “ports” as identifiers (pending + mask)
Four major purposes
  IPI: inter-processor interrupt
  IDC: inter-domain communication
  vIRQ: virtual interrupt
  pIRQ: physical interrupt

[Figure: Guest OSes with their VCPUs on the Hypervisor (Virtual CPU, Virtual Memory, Scheduling), which runs on the Hardware (Physical CPU, Physical Memory, Eth0, Eth1); arrows labeled IPI, IDC, vIRQ and pIRQ]
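
The "pending + mask" pair can be pictured as two bitmaps indexed by port. The sketch below is a simplified user-space model, assuming at most 32 ports held in one word each; `struct evtchn_state` and the `evtchn_*` functions are hypothetical names, not the Xen interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_PORTS 32   /* hypothetical: one 32-bit word per bitmap */

struct evtchn_state {
    uint32_t pending;  /* bit set: an event was sent on this port */
    uint32_t mask;     /* bit set: delivery on this port is blocked */
};

/* Sender side: mark the port pending. */
bool evtchn_send(struct evtchn_state *s, unsigned port)
{
    if (port >= NR_PORTS)
        return false;
    s->pending |= UINT32_C(1) << port;
    return true;
}

void evtchn_mask(struct evtchn_state *s, unsigned port)
{
    if (port < NR_PORTS)
        s->mask |= UINT32_C(1) << port;
}

void evtchn_unmask(struct evtchn_state *s, unsigned port)
{
    if (port < NR_PORTS)
        s->mask &= ~(UINT32_C(1) << port);
}

/* Receiver side: take the lowest pending, unmasked port and clear its
 * pending bit. Returns -1 when nothing is deliverable. */
int evtchn_next(struct evtchn_state *s)
{
    uint32_t ready = s->pending & ~s->mask;
    for (unsigned port = 0; port < NR_PORTS; port++) {
        if (ready & (UINT32_C(1) << port)) {
            s->pending &= ~(UINT32_C(1) << port);
            return (int)port;
        }
    }
    return -1;
}
```

A masked port keeps its pending bit, so the event is not lost; it is delivered once the port is unmasked.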

Page 11: CPU Virtualization

Architecture
  [Figure: Apps run on Guest OSes; each Guest OS has one or more VCPUs; Scheduling in the Hypervisor places VCPUs on PCPUs]
2 scheduling algorithms (Non-Work Conserving)
  Simple Earliest Deadline First (SEDF)
  Credit

Page 12: Interrupt

Physical interrupt
  For the hypervisor or for guest OSes
Virtual interrupt
  Asks guest OSes to do work
  8 for now (max is 24)

[Figure: native case, Device -> IRQn -> PIC -> OS on Hardware; virtualized case, Device -> IRQn -> PIC -> Hypervisor -> Guest OSes; labels: ISR, event]

Page 13: Memory Virtualization (1/2)

Two-level memory (without a hypervisor)
  Application - Virtual Memory
  OS - Physical Memory
Three-level memory: Virtual, Pseudo-physical, Machine
  Application - Virtual Memory
  Guest OS - Pseudo-Physical Memory
  Hypervisor - Machine Memory
Translation tables: P2M (pseudo-physical to machine), M2P (machine to pseudo-physical)
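
The two translation tables are inverse lookups of each other, which a pair of arrays makes concrete. This is a minimal sketch for a single guest, assuming tiny fixed sizes; `struct mem_maps`, `maps_assign`, `pfn_to_mfn` and `mfn_to_pfn` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define GUEST_PAGES   4    /* hypothetical: pseudo-physical frames of one guest */
#define MACHINE_PAGES 16   /* hypothetical: machine frames in the host */
#define INVALID_FRAME ((unsigned long)-1)

struct mem_maps {
    unsigned long p2m[GUEST_PAGES];    /* pseudo-physical frame -> machine frame */
    unsigned long m2p[MACHINE_PAGES];  /* machine frame -> pseudo-physical frame */
};

void maps_init(struct mem_maps *m)
{
    for (size_t i = 0; i < GUEST_PAGES; i++)
        m->p2m[i] = INVALID_FRAME;
    for (size_t i = 0; i < MACHINE_PAGES; i++)
        m->m2p[i] = INVALID_FRAME;
}

/* Back guest frame pfn with machine frame mfn, keeping both tables in sync.
 * Returns 0 on success, -1 if either frame is out of range or already used. */
int maps_assign(struct mem_maps *m, unsigned long pfn, unsigned long mfn)
{
    if (pfn >= GUEST_PAGES || mfn >= MACHINE_PAGES)
        return -1;
    if (m->p2m[pfn] != INVALID_FRAME || m->m2p[mfn] != INVALID_FRAME)
        return -1;
    m->p2m[pfn] = mfn;
    m->m2p[mfn] = pfn;
    return 0;
}

unsigned long pfn_to_mfn(const struct mem_maps *m, unsigned long pfn)
{
    return pfn < GUEST_PAGES ? m->p2m[pfn] : INVALID_FRAME;
}

unsigned long mfn_to_pfn(const struct mem_maps *m, unsigned long mfn)
{
    return mfn < MACHINE_PAGES ? m->m2p[mfn] : INVALID_FRAME;
}
```

The guest sees a contiguous range of pseudo-physical frames (0, 1, 2, ...) even when the machine frames behind them are scattered.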

Page 14: Memory Virtualization (2/2)

168M memory for hypervisor

Area                                                  Size
MPT, Machine-to-Physical Translation Table (RO)       16M
Page-Frame Information                                96M
MPT, Machine-to-Physical Translation Table (R/W)      16M
Linear Page Table                                     8M
Shadow Linear Page Table                              8M
Per Domain Mappings                                   8M
Direct Map                                            12M
I/O Remap                                             4M

Address labels in the figure: 0xFFFFFFFF, 0xFC400000, 0xFC000000; the Heap is also marked.
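
As a quick check, the eight areas in the table do add up to the 168M stated on the slide. The sketch below sums them and, under the assumption that the hypervisor region ends at the top of the 32-bit address space (0xFFFFFFFF, the highest address shown in the figure), computes where such a region would start. The function names are hypothetical and the computed base is a derived value, not one printed on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct area {
    const char *name;
    unsigned    size_mb;
};

/* Sizes copied from the slide's table. */
static const struct area hypervisor_areas[] = {
    { "MPT (RO)",                 16 },
    { "Page-Frame Information",   96 },
    { "MPT (R/W)",                16 },
    { "Linear Page Table",         8 },
    { "Shadow Linear Page Table",  8 },
    { "Per Domain Mappings",       8 },
    { "Direct Map",               12 },
    { "I/O Remap",                 4 },
};

/* Sum of all area sizes, in megabytes. */
unsigned total_size_mb(void)
{
    unsigned total = 0;
    size_t n = sizeof hypervisor_areas / sizeof hypervisor_areas[0];
    for (size_t i = 0; i < n; i++)
        total += hypervisor_areas[i].size_mb;
    return total;
}

/* Assumption: the region occupies the last size_mb megabytes below 4G. */
uint32_t region_base(unsigned size_mb)
{
    uint64_t top = UINT64_C(1) << 32;
    return (uint32_t)(top - (uint64_t)size_mb * 1024 * 1024);
}
```

With these inputs `total_size_mb()` gives 168 and `region_base(168)` gives 0xF5800000, leaving the rest of the 4G address space below it for the guest.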

Page 15: Memory Virtualization - Translation

4 mechanisms to manipulate page tables
  Paravirtualized page tables
  Writable page tables (only level 1 is writable)
  Shadow page tables
  Hardware-assisted paging

[Figure: Virtual Memory, Pseudo-Physical Memory and Machine Memory; Page Table (VM->PFN); P2M; Page Fault! leading to the Shadow Page Table (VM->MFN or VM->P2M); Second Level Paging (HAP) in the MMU]

Page 16: Memory Virtualization - Shared Info Page

Structure: wall clock, event channel, memory, TSC (MAX: 32 VCPUs)

Compare with start_info_page

               Start Info Page    Shared Info Page
Mapped by      Domain Builder     Guest OS
Information    Static             Dynamically Updated

Page 17: I/O Device Virtualization

Hypervisor also provides three mechanisms to use devices.
  Emulated Devices
  Paravirtualized Driver
  Pass-through

Page 18: I/O Device Virtualization - Emulated Devices

Implemented by QEMU (QEMU-DM)
  e.g. sound cards: ac97, sb16, etc.

Page 19: I/O Device Virtualization - Paravirtualized Driver

Split Device Driver Model
  Front-End Driver
  Back-End Driver
  Native Driver
An example of sending packets
  [Figure: packet path through the Front-End Driver, Back-End Driver and Native Driver]

Page 20: I/O Device Virtualization - I/O Ring

The ring carries no data; it only transfers requests/replies
An example with GR
  [Figure: Dom U and Dom 0 exchange GRs over the I/O Channel; the Hypervisor holds the Grant Table and the Active Grant Table; Dom 0 reaches the Device]
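
A ring that carries only small request records, each naming its data page by grant reference, can be sketched as a fixed array with producer and consumer counters. This is a simplified one-direction model, not Xen's ring macros: `struct io_request`, `struct io_ring`, `ring_put`, `ring_get` and the ring size are hypothetical, and replies would travel the opposite way in the same manner.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4   /* hypothetical; a power of two so index wraparound stays consistent */

/* A request holds no payload, only an id and the GR of the data page. */
struct io_request {
    int id;
    int gref;
};

struct io_ring {
    struct io_request slot[RING_SIZE];
    unsigned prod;   /* free-running count of requests queued (front end) */
    unsigned cons;   /* free-running count of requests taken (back end) */
};

/* Front end: queue a request; fails when the ring is full. */
bool ring_put(struct io_ring *r, struct io_request req)
{
    if (r->prod - r->cons == RING_SIZE)
        return false;
    r->slot[r->prod % RING_SIZE] = req;
    r->prod++;
    return true;
}

/* Back end: take the oldest request; fails when the ring is empty. */
bool ring_get(struct io_ring *r, struct io_request *out)
{
    if (r->prod == r->cons)
        return false;
    *out = r->slot[r->cons % RING_SIZE];
    r->cons++;
    return true;
}
```

Keeping the payload out of the ring keeps each slot small and fixed-size; the bulk data stays in the granted page.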

Page 21: I/O Device Virtualization - Pass-Through

Pass the device through and use it directly

[Figure: Dom 0 and Dom U guests with Native Drivers on the Hypervisor (Virtual CPU, Virtual Memory, Scheduling) and the Hardware (Physical CPU, Physical Memory, Eth0, Eth1)]

Page 22: Hardware Virtual Machine

Intel Virtualization Technology

Technology    Description                          Virtualization     Implementation
VT-x          Root/NonRoot, Extended Page Tables   CPU, Memory        Instruction Set
VT-i          As VT-x, for Itanium
VT-d          DMA, Interrupt                       Devices            IOMMU (Chipset)
VT-c          Classify Packets                     Network Devices    VMDq, VMDc

Page 23: CPU Benchmark (1/2)

[Figure: CPU benchmark chart; labeled value: 8.3%]
Average over 100 tests, Deviation: 0.066~0.128%

Page 24: CPU Benchmark (2/2)

[Figure: CPU benchmark chart; labeled value: 5%]
Calculate 32M digits of π.

Page 25: Hard Disk Drive Benchmark

[Figure: hard disk drive benchmark chart]

Page 26: Network Benchmark (1/2)

[Figure: network benchmark chart; labeled value: 59%]
Testing Time: 180 seconds, Deviation: 0.12~0.26%.

Page 27: Network Benchmark (2/2)

[Figure: network benchmark chart]
Sample Period: 2 seconds
Average: 9.82%

Page 28: Answers for Big Questions

How fast can virtualization be?
  95+%, 99.9%
What kinds of applications?
  Well …
What problems might it incur?
  Technical
  Data
  Security
  Business
  Politics
  Globalization (G11N) = Internationalization (I18N) + Localization (L10N)
  …

Page 29: Summary

Stay hungry to be full [of passion].
Stay foolish to be smart [on absorption].
If the false is taken as true, the true is also false.
Virtualized reality. Real virtualization.
Virtualized to go anywhere.
Key is the system. System is the key.
  E.g. Virtual Tape Library