Advanced Information Technology - Virtualization (2)
- Virtualization (V12N)
薛智文
Department of Computer Science and Information Engineering, National Taiwan University
http://www.csie.ntu.edu.tw/~cwhsueh/
100 Fall, Nov 4, Fri 678, DTH 104
NEWS Lab, Graduate Institute of Networking and Multimedia, Department of Computer Science and Information Engineering
Outline
  Introduction
  Xen Architecture
  Hypercall
  CPU Virtualization
  Memory Virtualization
  I/O Device Virtualization
  Hardware Virtual Machine
  Benchmark
  Domain 1
  Summary
How to Virtualize?
  Full Virtualization: binary translation
  Paravirtualization: hypercall
  Hardware-Assisted Virtualization: Intel VT-x & AMD SVM, trap and emulate
Type I - Hypervisor (e.g. Xen)
  Stack, top to bottom: VM0, VM1, ..., VMN / Hypervisor (Virtual Machine Monitor, VMM) / Hardware
Type II - Hosted VMM (e.g. VMware)
  Stack, top to bottom: VM0, VM1, ..., VMN / Hosted VMM / Host Operating System / Hardware
VM: Virtual Machine = Guest OS + Virtual Devices
Hypervisor (VMM) Type
  Type I + Microkernel: Xen (open source, Citrix), Microsoft Hyper-V
  Type I + Integrated kernel: VMware ESX, KVM (Kernel-based Virtual Machine)
  Type II (Host OS + Guest OS): VMware GSX, VMware Workstation, Microsoft Virtual PC, Microsoft Virtual Server, Sun VirtualBox
Xen Architecture (1/2)
  [Figure: Domain 0 and three Domain U guests]
Xen Architecture (2/2)
Compare to common Linux
  Linux                  Xen
  System Calls           Hyper Calls
  Signals                Events
  Interrupts             Physical + Virtual Interrupts
  CPU                    PCPU + VCPU
  Filesystem             XenStore
  POSIX Shared Memory    Grant Tables/Shared Pages
Hyper Call
System Call (int 0x80)
  // linux/include/asm/unistd.h
  #define __NR_restart_syscall 0
  #define __NR_exit            1
  #define __NR_fork            2
  #define __NR_read            3
  …
Hyper Call (int 0x82)
  // xen/include/public/xen.h
  #define __HYPERVISOR_set_trap_table 0
  #define __HYPERVISOR_mmu_update     1
  #define __HYPERVISOR_set_gdt        2
  #define __HYPERVISOR_stack_switch   3
  …
Hypercall flow (Guest OS -> Hypervisor)
  Guest OS: HYPERVISOR_sched_op issues int 82h (hypercall)
  Hypervisor: Hypercall_table lookup, then do_sched_op
  iret: resume Guest OS
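The hypercall path on this slide (int 82h, table lookup, handler, iret) has the same shape as a system call table. The following is a minimal sketch of the table lookup only, not Xen's implementation: the `toy_` names, the handler bodies, the table size, and the error value are all hypothetical, and only the four hypercall numbers come from the slide.

```c
#include <stddef.h>

/* Hypercall numbers as listed from xen/include/public/xen.h on the slide. */
#define __HYPERVISOR_set_trap_table 0
#define __HYPERVISOR_mmu_update     1
#define __HYPERVISOR_set_gdt        2
#define __HYPERVISOR_stack_switch   3

#define TOY_NR_HYPERCALLS 4      /* hypothetical table size */
#define TOY_ENOSYS        (-38)  /* hypothetical "no such hypercall" value */

typedef long (*toy_hypercall_fn)(unsigned long arg);

/* Hypothetical handlers: each returns a recognizable value. */
static long toy_do_set_trap_table(unsigned long arg) { (void)arg; return 100; }
static long toy_do_mmu_update(unsigned long arg)     { return 200 + (long)arg; }
static long toy_do_set_gdt(unsigned long arg)        { (void)arg; return 300; }
static long toy_do_stack_switch(unsigned long arg)   { (void)arg; return 400; }

/* The table is indexed by hypercall number. */
static const toy_hypercall_fn toy_hypercall_table[TOY_NR_HYPERCALLS] = {
    [__HYPERVISOR_set_trap_table] = toy_do_set_trap_table,
    [__HYPERVISOR_mmu_update]     = toy_do_mmu_update,
    [__HYPERVISOR_set_gdt]        = toy_do_set_gdt,
    [__HYPERVISOR_stack_switch]   = toy_do_stack_switch,
};

/* Stands in for the trap handler reached through int 0x82:
 * validate the number, call the handler, hand the result back. */
long toy_hypercall(unsigned long nr, unsigned long arg)
{
    if (nr >= TOY_NR_HYPERCALLS || toy_hypercall_table[nr] == NULL)
        return TOY_ENOSYS;
    return toy_hypercall_table[nr](arg);
}
```

A guest-side wrapper would call `toy_hypercall(__HYPERVISOR_mmu_update, arg)`; in a real system the call crosses a privilege boundary instead of being a plain function call.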
Grant Table
  Page mapping & page transferring
  Page as a unit
  Grant reference (GR), grant entry
  [Figure, page mapping between Domain A and Domain B: create GR, send GR, map page, access page, unmap page, inform, release GR]
  [Figure, page transferring between Domain A and Domain B: create GR, send GR, transfer page, receive page, inform, release GR]
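The page-mapping sequence (create GR, map, access, unmap, release GR) can be sketched as a small table of grant entries. This is an illustration under assumptions, not Xen's grant table format: it assumes Domain A is the granter and Domain B the mapper, and every `toy_` name, field, and size is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_GRANT_ENTRIES 8           /* hypothetical table size */
#define TOY_INVALID_FRAME UINT32_MAX  /* hypothetical "no frame" marker */

/* One grant entry: which domain may map which page frame. */
struct toy_grant_entry {
    bool     in_use;
    uint16_t grantee;    /* domain allowed to map the page */
    uint32_t frame;      /* the shared page (page as a unit) */
    unsigned map_count;  /* mappings currently held by the grantee */
};

struct toy_grant_table {
    struct toy_grant_entry e[TOY_GRANT_ENTRIES];
};

/* Granter: create a GR. Returns the reference, or -1 if the table is full. */
int toy_grant_create(struct toy_grant_table *t, uint16_t grantee, uint32_t frame)
{
    for (int gr = 0; gr < TOY_GRANT_ENTRIES; gr++) {
        if (!t->e[gr].in_use) {
            t->e[gr] = (struct toy_grant_entry){ true, grantee, frame, 0 };
            return gr;
        }
    }
    return -1;
}

/* Mapper: map the page named by a GR. Returns the frame, or
 * TOY_INVALID_FRAME if the GR is unknown or names another domain. */
uint32_t toy_grant_map(struct toy_grant_table *t, int gr, uint16_t domid)
{
    if (gr < 0 || gr >= TOY_GRANT_ENTRIES) return TOY_INVALID_FRAME;
    struct toy_grant_entry *g = &t->e[gr];
    if (!g->in_use || g->grantee != domid) return TOY_INVALID_FRAME;
    g->map_count++;
    return g->frame;
}

/* Mapper: unmap. Returns false when there is nothing to unmap. */
bool toy_grant_unmap(struct toy_grant_table *t, int gr, uint16_t domid)
{
    if (gr < 0 || gr >= TOY_GRANT_ENTRIES) return false;
    struct toy_grant_entry *g = &t->e[gr];
    if (!g->in_use || g->grantee != domid || g->map_count == 0) return false;
    g->map_count--;
    return true;
}

/* Granter: release the GR. Refused while the page is still mapped. */
bool toy_grant_release(struct toy_grant_table *t, int gr)
{
    if (gr < 0 || gr >= TOY_GRANT_ENTRIES) return false;
    struct toy_grant_entry *g = &t->e[gr];
    if (!g->in_use || g->map_count != 0) return false;
    g->in_use = false;
    return true;
}
```

The "inform" step in the figure is the point where the mapper tells the granter it has unmapped, so that the release can succeed.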
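Event Channel
  A lightweight signal mechanism
  Uses "ports" as identifiers (pending + mask)
  Four major purposes
    IPI (inter-processor interrupt)
    IDC (inter-domain communication)
    vIRQ (virtual IRQ)
    pIRQ (physical IRQ)
  [Figure: guest OSes with VCPUs above the hypervisor (virtual CPU, virtual memory, scheduling) and the hardware (physical CPU, physical memory, Eth0, Eth1), with IPI, IDC, vIRQ, and pIRQ event paths]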
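The "pending + mask" note suggests one pending bit and one mask bit per port. A minimal single-threaded sketch of that idea follows; the 32-port limit, the struct layout, and all `toy_` names are assumptions made for illustration and do not reflect Xen's actual data structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_PORTS 32  /* hypothetical: one bit per port in a 32-bit word */

struct toy_evtchn {
    uint32_t pending;  /* bit p set: an event is waiting on port p */
    uint32_t mask;     /* bit p set: delivery on port p is suppressed */
};

/* Sender: mark the port pending. Returns true if the receiver should be
 * notified now, i.e. the port is valid and not masked. */
bool toy_evtchn_send(struct toy_evtchn *ec, unsigned port)
{
    if (port >= TOY_NR_PORTS) return false;
    uint32_t bit = UINT32_C(1) << port;
    ec->pending |= bit;
    return (ec->mask & bit) == 0;
}

/* Receiver: suppress delivery on a port (events still accumulate). */
void toy_evtchn_mask(struct toy_evtchn *ec, unsigned port)
{
    if (port < TOY_NR_PORTS) ec->mask |= UINT32_C(1) << port;
}

/* Receiver: re-enable delivery. Returns true if an event was waiting. */
bool toy_evtchn_unmask(struct toy_evtchn *ec, unsigned port)
{
    if (port >= TOY_NR_PORTS) return false;
    uint32_t bit = UINT32_C(1) << port;
    ec->mask &= ~bit;
    return (ec->pending & bit) != 0;
}

/* Receiver: take the lowest pending, unmasked port and clear its pending
 * bit. Returns the port number, or -1 if nothing is deliverable. */
int toy_evtchn_next(struct toy_evtchn *ec)
{
    uint32_t ready = ec->pending & ~ec->mask;
    for (unsigned port = 0; port < TOY_NR_PORTS; port++) {
        uint32_t bit = UINT32_C(1) << port;
        if (ready & bit) {
            ec->pending &= ~bit;
            return (int)port;
        }
    }
    return -1;
}
```

An event carries no payload here; like a signal, it only says that something happened on a port.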
CPU Virtualization
  Architecture
    [Figure: apps run in guest OSes; each guest OS has one or more VCPUs; the hypervisor's scheduling places VCPUs on PCPUs]
  2 scheduling algorithms (non-work-conserving)
    Simple Earliest Deadline First (SEDF)
    Credit
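To make the earliest-deadline-first idea concrete, here is a minimal sketch of the selection step only: pick the runnable VCPU whose deadline comes first. It is not Xen's SEDF scheduler (which also tracks periods and slices); the struct fields and `toy_` names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_vcpu {
    int      id;
    bool     runnable;
    uint64_t deadline;  /* absolute deadline in arbitrary time units */
};

/* Return the runnable VCPU with the earliest deadline, or NULL when no
 * VCPU is runnable (the PCPU would idle). Ties go to the lowest index. */
struct toy_vcpu *toy_edf_pick(struct toy_vcpu *v, size_t n)
{
    struct toy_vcpu *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!v[i].runnable)
            continue;
        if (best == NULL || v[i].deadline < best->deadline)
            best = &v[i];
    }
    return best;
}
```

A hypervisor would run this kind of selection once per PCPU at each scheduling point, then assign the chosen VCPU to that PCPU.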
Interrupt
  Physical interrupt
    For the hypervisor or for guest OSes
  Virtual interrupt
    Asks guest OSes to act
    8 for now (max is 24)
  [Figure: native case, Device -> IRQn -> PIC -> OS on the hardware; virtualized case, Device -> IRQn -> PIC -> Hypervisor, which passes an event to a Guest OS; an ISR is also labeled]
Memory Virtualization (1/2)
  Two-level memory (no hypervisor)
    Application - Virtual Memory
    OS - Physical Memory
  Three-level memory: Virtual, Pseudo-physical, Machine
    Application - Virtual Memory
    Guest OS - Pseudo-Physical Memory
    Hypervisor - Machine Memory
  P2M (physical-to-machine) and M2P (machine-to-physical) tables
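The P2M and M2P tables can be pictured as two arrays that are inverses of each other: one maps a guest's pseudo-physical frame number (PFN) to a machine frame number (MFN), the other maps back. The sketch below is a toy model with hypothetical sizes and `toy_` names, not Xen's table layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_GUEST_PAGES   4            /* hypothetical guest size in pages */
#define TOY_MACHINE_PAGES 16           /* hypothetical machine size in pages */
#define TOY_INVALID       UINT32_MAX   /* hypothetical "no mapping" marker */

struct toy_phys_map {
    uint32_t p2m[TOY_GUEST_PAGES];    /* PFN -> MFN */
    uint32_t m2p[TOY_MACHINE_PAGES];  /* MFN -> PFN */
};

/* Start with no frames assigned. */
void toy_map_init(struct toy_phys_map *m)
{
    for (uint32_t i = 0; i < TOY_GUEST_PAGES; i++)   m->p2m[i] = TOY_INVALID;
    for (uint32_t i = 0; i < TOY_MACHINE_PAGES; i++) m->m2p[i] = TOY_INVALID;
}

/* Give machine frame mfn to the guest as pseudo-physical frame pfn.
 * Both tables are updated together so they stay inverses. */
bool toy_map_set(struct toy_phys_map *m, uint32_t pfn, uint32_t mfn)
{
    if (pfn >= TOY_GUEST_PAGES || mfn >= TOY_MACHINE_PAGES) return false;
    if (m->p2m[pfn] != TOY_INVALID || m->m2p[mfn] != TOY_INVALID) return false;
    m->p2m[pfn] = mfn;
    m->m2p[mfn] = pfn;
    return true;
}

uint32_t toy_pfn_to_mfn(const struct toy_phys_map *m, uint32_t pfn)
{
    return pfn < TOY_GUEST_PAGES ? m->p2m[pfn] : TOY_INVALID;
}

uint32_t toy_mfn_to_pfn(const struct toy_phys_map *m, uint32_t mfn)
{
    return mfn < TOY_MACHINE_PAGES ? m->m2p[mfn] : TOY_INVALID;
}
```

The guest sees a contiguous range of PFNs starting at 0 even though the MFNs behind them can be scattered across machine memory.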
Memory Virtualization (2/2)
  168M memory for hypervisor
  Area                                                 Size
  MPT, Machine-to-Physical Translation Table (RO)      16M
  Page-Frame Information                               96M
  MPT, Machine-to-Physical Translation Table (R/W)     16M
  Linear Page Table                                    8M
  Shadow Linear Page Table                             8M
  Per Domain Mappings                                  8M
  Direct Map                                           12M
  I/O Remap                                            4M
  [Figure labels: addresses 0xFFFFFFFF, 0xFC400000, 0xFC000000; Heap]
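As a quick arithmetic check, the area sizes listed on this slide do add up to the stated 168M. The sketch below just sums them; the names and sizes are copied from the slide, while the struct and the `toy_` function names are hypothetical.

```c
#include <stddef.h>

struct toy_area {
    const char *name;
    unsigned    size_mb;
};

/* Areas and sizes as listed on the slide. */
static const struct toy_area toy_xen_areas[] = {
    { "MPT, Machine-to-Physical Translation Table (RO)",  16 },
    { "Page-Frame Information",                           96 },
    { "MPT, Machine-to-Physical Translation Table (R/W)", 16 },
    { "Linear Page Table",                                 8 },
    { "Shadow Linear Page Table",                          8 },
    { "Per Domain Mappings",                               8 },
    { "Direct Map",                                       12 },
    { "I/O Remap",                                         4 },
};

/* Number of entries in the table above. */
size_t toy_xen_area_count(void)
{
    return sizeof toy_xen_areas / sizeof toy_xen_areas[0];
}

/* Sum the sizes of n areas, in megabytes. */
unsigned toy_total_mb(const struct toy_area *areas, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += areas[i].size_mb;
    return total;
}
```

Calling `toy_total_mb(toy_xen_areas, toy_xen_area_count())` gives 16 + 96 + 16 + 8 + 8 + 8 + 12 + 4 = 168.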
Memory Virtualization - Translation
  4 mechanisms to manipulate page tables
    Paravirtualized page tables
    Writable page tables (only level 1 is writable)
    Shadow page tables
    Hardware-assisted paging (HAP)
  [Figure: Virtual Memory, Pseudo-Physical Memory, Machine Memory; Page Table (VM->PFN); Page Fault!; Shadow Page Table (VM->MFN or VM->P2M); P2M; second-level paging (HAP); MMU]
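One way to read the shadow page table labels is that the shadow entry is the composition of the guest's page table (virtual page to PFN) with the P2M table (PFN to MFN), filled in when a translation is missing. The sketch below models that with single-level arrays; it is a simplification under stated assumptions, and every size and `toy_` name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_VPAGES 8           /* hypothetical number of virtual pages */
#define TOY_PPAGES 8           /* hypothetical number of pseudo-physical pages */
#define TOY_NONE   UINT32_MAX  /* hypothetical "not present" marker */

struct toy_mmu {
    uint32_t guest_pt[TOY_VPAGES];   /* guest page table: VPN -> PFN */
    uint32_t p2m[TOY_PPAGES];        /* hypervisor table: PFN -> MFN */
    uint32_t shadow_pt[TOY_VPAGES];  /* shadow table: VPN -> MFN */
    unsigned shadow_faults;          /* how often the shadow had to be filled */
};

void toy_mmu_init(struct toy_mmu *m)
{
    for (uint32_t i = 0; i < TOY_VPAGES; i++) {
        m->guest_pt[i]  = TOY_NONE;
        m->shadow_pt[i] = TOY_NONE;
    }
    for (uint32_t i = 0; i < TOY_PPAGES; i++)
        m->p2m[i] = TOY_NONE;
    m->shadow_faults = 0;
}

/* Translate a virtual page to a machine frame. On a shadow miss, emulate
 * the page-fault path: compose guest_pt with p2m and cache the result. */
uint32_t toy_translate(struct toy_mmu *m, uint32_t vpn)
{
    if (vpn >= TOY_VPAGES) return TOY_NONE;
    if (m->shadow_pt[vpn] != TOY_NONE)
        return m->shadow_pt[vpn];              /* shadow hit */

    m->shadow_faults++;                         /* shadow miss */
    uint32_t pfn = m->guest_pt[vpn];
    if (pfn == TOY_NONE || pfn >= TOY_PPAGES)
        return TOY_NONE;                        /* the guest itself has no mapping */
    uint32_t mfn = m->p2m[pfn];
    if (mfn == TOY_NONE)
        return TOY_NONE;
    m->shadow_pt[vpn] = mfn;
    return mfn;
}

/* The guest changes a page table entry: drop the stale shadow entry so
 * the next access refills it. */
bool toy_guest_set_pte(struct toy_mmu *m, uint32_t vpn, uint32_t pfn)
{
    if (vpn >= TOY_VPAGES || pfn >= TOY_PPAGES) return false;
    m->guest_pt[vpn]  = pfn;
    m->shadow_pt[vpn] = TOY_NONE;
    return true;
}
```

With hardware-assisted paging the second translation step is done by the MMU itself, so no software-maintained shadow table is needed.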
Memory Virtualization - Shared Info Page
  Structure
    [Figure labels: event channel, wall clock, TSC, memory; MAX: 32 VCPUs]
  Compare with start_info_page
                   Start Info Page    Shared Info Page
    Mapped by      Domain Builder     Guest OS
    Information    Static             Dynamically Updated
I/O Device Virtualization
The hypervisor also provides three mechanisms for using devices:
Emulated Devices
Paravirtualized Driver
Pass-through
I/O Device Virtualization - Emulated Devices
  Implemented by QEMU (QEMU-DM)
  e.g. sound cards: ac97, sb16, etc.
I/O Device Virtualization - Paravirtualized Driver
  Split Device Driver Model
  An example of sending packets
    [Figure: Front-End Driver -> Back-End Driver -> Native Driver]
I/O Device Virtualization - I/O Ring
  The ring carries no data; it only transfers requests/replies
  An example with GR
  [Figure labels: Dom U, Dom 0, Hypervisor, Grant Table, Active Grant Table, GR, I/O Channel, Device]
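Since the ring transfers only requests and replies, each slot can hold a small descriptor that names the data page by grant reference (GR) rather than containing the data. Below is a minimal single-threaded sketch of the request direction; a real shared ring also needs a response direction and memory barriers, and the field and `toy_` names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 4  /* hypothetical; must be a power of two */

/* A request names its data page by grant reference; no payload inside. */
struct toy_request {
    uint32_t id;
    uint32_t gref;
};

struct toy_ring {
    uint32_t req_prod;  /* free-running producer index (front end) */
    uint32_t req_cons;  /* free-running consumer index (back end) */
    struct toy_request slot[TOY_RING_SIZE];
};

/* Front end: queue a request. Returns false when the ring is full. */
bool toy_ring_put(struct toy_ring *r, struct toy_request req)
{
    if (r->req_prod - r->req_cons == TOY_RING_SIZE)
        return false;
    r->slot[r->req_prod & (TOY_RING_SIZE - 1)] = req;
    r->req_prod++;
    return true;
}

/* Back end: take the oldest request. Returns false when the ring is empty. */
bool toy_ring_get(struct toy_ring *r, struct toy_request *out)
{
    if (r->req_prod == r->req_cons)
        return false;
    *out = r->slot[r->req_cons & (TOY_RING_SIZE - 1)];
    r->req_cons++;
    return true;
}
```

The indices only ever increase and are reduced to a slot position with a bit mask, which is why the ring size is assumed to be a power of two.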
I/O Device Virtualization - Pass-Through
  Pass the device through so that a guest uses it directly
  [Figure: Dom 0 and Dom U guests with native drivers above the hypervisor (virtual CPU, virtual memory, scheduling) and the hardware (physical CPU, physical memory, Eth0, Eth1)]
Hardware Virtual Machine
Intel Virtualization Technology
  Technology   Description                           Virtualization     Implementation
  VT-x         Root/NonRoot, Extended Page Tables    CPU, Memory        Instruction Set
  VT-i         As VT-x, for Itanium
  VT-d         DMA, Interrupt                        Devices            IOMMU (Chipset)
  VT-c         Classify Packets                      Network Devices    VMDq, VMDc
CPU Benchmark (1/2)
8.3%
Average over 100 tests, Deviation: 0.066~0.128%
CPU Benchmark (2/2)
5%
Calculate 32M digits of π.
Hard Disk Drive Benchmark
Network Benchmark (1/2)
Testing Time: 180 seconds, Deviation: 0.12~0.26%.
59%
Network Benchmark (2/2)
Sample Period: 2 seconds
Average: 9.82%
Answers for Big Questions
  How fast can virtualization be?
    95+%, 99.9%
  What kinds of applications?
    Well …
  What problems might it incur?
    Technical
    Data
    Security
    Business
    Politics
    Globalization (G11N) = Internationalization (I18N) + Localization (L10N)
    …
Summary
  Stay hungry to be full [of passion].
  Stay foolish to be smart [on absorption].
  If the false is taken as true, the true is also false.
  Virtualized reality.
  Real virtualization.
  Virtualized to go anywhere.
  Key is the system.
  System is the key.
  E.g. Virtual Tape Library