Chih-Wen Hsueh (薛智文) [email protected] http://www.csie.ntu.edu.tw/~cwhsueh/
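Department of Computer Science and Information Engineering, National Taiwan University
http://www.csie.ntu.edu.tw/~cwhsueh/
101 Spring, March 22, Fri 678, DTH 104
Advanced Information Technology (II)
- Virtualization (2) -
Virtualization (V12N)
NEWS Lab, Graduate Institute of Networking and Multimedia, Department of CSIE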
Outline
Case Study
  Xen
    Architecture
    Hypercall
    CPU Virtualization
    Memory Virtualization
    I/O Device Virtualization
    Hardware Virtual Machine
    Benchmark
  Domain X
  KVM
  Ubitus
  BitCoin
  WeOS
Summary
How to Virtualize?
Full Virtualization: binary translation
Para Virtualization: hypercall
Hardware Assisted Virtualization: Intel VT-x & AMD SVM, trap and emulate
Type I - Hypervisor, e.g. Xen, VMware
  VM0 VM1 … VMN
  Hypervisor
  Hardware
Type II - Hosted VMM, e.g. KVM, VMware
  VM0 VM1 … VMN
  Virtual Machine Monitor (VMM)
  Host Operating System
  Hardware
VM: Virtual Machine, Guest OS + Virtual Devices
Hypervisor (VMM) Type
Type I + Microkernel: Xen (open source, Citrix), Microsoft Hyper-V
Type I + Integrated kernel: VMware ESX, KVM (kernel-based VM)
Type II (Host OS + Guest OS): VMware GSX, VMware Workstation, Microsoft Virtual PC, Microsoft Virtual Server, Sun VirtualBox
Xen Architecture (1/2)
[Figure: Domain 0 and three Domain U guests; QEMU]
Xen Architecture (2/2)
Compare to common Linux
Linux | Xen
System Calls | Hypercalls
Signals | Events
Interrupts | Physical + Virtual Interrupts
CPU | PCPU + VCPU
Filesystem | XenStore
POSIX Shared Memory | Grant Tables/Shared Pages
System Call (int 0x80)
// linux/include/asm/unistd.h
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
…
Hyper Call (int 0x82)
// xen/include/public/xen.h
#define __HYPERVISOR_set_trap_table 0
#define __HYPERVISOR_mmu_update 1
#define __HYPERVISOR_set_gdt 2
#define __HYPERVISOR_stack_switch 3
…
Hyper Call flow
Guest OS: HYPERVISOR_sched_op, int 82h (hypercall)
Hypervisor: hypercall_table, do_sched_op, iret
Resume Guest OS
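The hypercall path (a numbered request from the guest, a table lookup in the hypervisor, a handler, then a return to the guest) can be modeled roughly as below. This is a toy Python sketch, not Xen code: the handler bodies, the `hypercall` function, and the return values are hypothetical; only the four call numbers come from the xen.h excerpt.

```python
import errno

# Call numbers taken from the xen.h excerpt on the slide.
HYPERVISOR_set_trap_table = 0
HYPERVISOR_mmu_update = 1
HYPERVISOR_set_gdt = 2
HYPERVISOR_stack_switch = 3


# Hypothetical handlers: real handlers change hypervisor state;
# these only report which call was dispatched.
def do_set_trap_table(*args):
    return "set_trap_table handled"


def do_mmu_update(*args):
    return "mmu_update handled"


def do_set_gdt(*args):
    return "set_gdt handled"


def do_stack_switch(*args):
    return "stack_switch handled"


# The table maps a call number to its handler.
hypercall_table = {
    HYPERVISOR_set_trap_table: do_set_trap_table,
    HYPERVISOR_mmu_update: do_mmu_update,
    HYPERVISOR_set_gdt: do_set_gdt,
    HYPERVISOR_stack_switch: do_stack_switch,
}


def hypercall(number, *args):
    """Look up the handler for `number` and run it; unknown numbers fail."""
    handler = hypercall_table.get(number)
    if handler is None:
        return -errno.ENOSYS
    return handler(*args)
```

A system call table works the same way, which is why the slide sets the two headers side by side.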
Grant Table
Page mapping & page transferring
Page as a unit
Grant reference (GR), grant entry
Page mapping between Domain A and Domain B, steps shown: create GR, send GR, map page, access page, unmap page, inform, release GR
Page transferring between Domain A and Domain B, steps shown: create GR, send GR, transfer page, receive page, inform, release GR
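The page-mapping steps (create GR, map, unmap, release GR) can be sketched as a small bookkeeping structure. This is an illustrative Python model under assumptions: the class `GrantTable`, its method names, and the string page labels are hypothetical and do not mirror the Xen grant-table interface.

```python
class GrantTable:
    """Toy grant table owned by one domain (hypothetical model)."""

    def __init__(self, owner):
        self.owner = owner
        self.entries = {}   # grant reference -> grant entry
        self.next_ref = 0

    def create(self, page, to_domain):
        """Owner creates a grant entry and gets back a grant reference."""
        ref = self.next_ref
        self.next_ref += 1
        self.entries[ref] = {"page": page, "to": to_domain, "mapped": False}
        return ref

    def map(self, ref, domain):
        """The grantee maps the page; other domains are refused."""
        entry = self.entries[ref]
        if entry["to"] != domain:
            raise PermissionError("grant was issued to another domain")
        entry["mapped"] = True
        return entry["page"]

    def unmap(self, ref, domain):
        entry = self.entries[ref]
        if entry["to"] != domain:
            raise PermissionError("grant was issued to another domain")
        entry["mapped"] = False

    def release(self, ref):
        """Owner releases the reference once the grantee has unmapped."""
        if self.entries[ref]["mapped"]:
            raise RuntimeError("page is still mapped")
        del self.entries[ref]
```

In this model Domain A would call `create`, pass the returned reference to Domain B over some channel, and call `release` only after B reports that it has unmapped the page.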
Event Channel
A lightweight signal mechanism
Use "ports" as identifiers (pending + mask)
Four major purposes: IPI, IDC, vIRQ, pIRQ
[Figure: guest OSes with VCPUs on top of the hypervisor (virtual CPU, virtual memory, scheduling) and the hardware (physical CPU, physical memory, Eth0, Eth1, …)]
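The pending and mask bits per port can be illustrated with a short model. This Python sketch is an assumption-laden simplification: the class `EventChannels`, the `delivered` list, and the delivery rule are hypothetical stand-ins for what a guest would do on an upcall.

```python
class EventChannels:
    """Toy model: each port has a pending bit and a mask bit."""

    def __init__(self, nr_ports=32):
        self.pending = [False] * nr_ports
        self.mask = [False] * nr_ports
        self.delivered = []  # ports whose events reached the guest handler

    def send(self, port):
        """Signal a port: set pending, deliver if the port is not masked."""
        self.pending[port] = True
        self._try_deliver(port)

    def set_mask(self, port, masked):
        """Mask or unmask a port; unmasking delivers a waiting event."""
        self.mask[port] = masked
        if not masked:
            self._try_deliver(port)

    def _try_deliver(self, port):
        if self.pending[port] and not self.mask[port]:
            self.pending[port] = False
            self.delivered.append(port)
```

An event sent to a masked port is not lost: it stays pending and is delivered when the port is unmasked.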
CPU Virtualization
Architecture
[Figure: apps run on guest OSes, each guest OS has one or more VCPUs, and hypervisor scheduling places VCPUs on PCPUs]
2 scheduling algorithms (Non-Work Conserving)
Simple Earliest Deadline First (SEDF)
Credit
Interrupt
Physical interrupt: for the hypervisor or for guest OSes
Virtual interrupt: ask guest OSes to do; 8 for now (max is 24)
[Figure, native: Device, IRQn, PIC, OS (ISR) on hardware]
[Figure, virtualized: Device, IRQn, PIC, Hypervisor, event, Guest OS …]
Memory Virtualization (1/2)
Two-level memory (without a hypervisor)
  Application - Virtual Memory
  OS - Physical Memory
Three-level memory: Virtual, Pseudo-physical, Machine
  Application - Virtual Memory
  Guest OS - Pseudo-Physical Memory
  Hypervisor - Machine Memory
P2M and M2P tables translate between pseudo-physical and machine memory.
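The three-level translation can be written out as two table lookups. The Python sketch below is a simplified illustration: the flat dictionaries, the 4 KiB page size, and the function names `translate` and `build_m2p` are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def translate(vaddr, guest_page_table, p2m):
    """Virtual address -> machine address via pseudo-physical memory.

    guest_page_table maps virtual page number -> pseudo-physical frame (PFN).
    p2m maps PFN -> machine frame number (MFN).
    """
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    pfn = guest_page_table[vpn]
    mfn = p2m[pfn]
    return mfn * PAGE_SIZE + offset


def build_m2p(p2m):
    """Invert a P2M table to get the machine-to-physical (M2P) view."""
    return {mfn: pfn for pfn, mfn in p2m.items()}
```

For example, with `guest_page_table = {1: 7}` and `p2m = {7: 42}`, virtual page 1 lands in machine frame 42 and the page offset is carried through unchanged.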
Memory Virtualization (2/2)
168M memory for hypervisor
Area | Size
MPT, Machine-to-Physical Translation Table (RO) | 16M
Page-Frame Information | 96M
MPT, Machine-to-Physical Translation Table (R/W) | 16M
Linear Page Table | 8M
Shadow Linear Page Table | 8M
Per Domain Mappings | 8M
Direct Map | 12M
I/O Remap | 4M
Address labels in the figure: 0xFFFFFFFF, 0xFC400000, 0xFC000000; Heap
Memory Virtualization - Translation
4 mechanisms to manipulate page tables
Paravirtualized page tables
Writable page tables (only level 1 is writable)
Shadow page tables
Hardware-assisted paging
[Figure: Virtual Memory, Page Table (VM->PFN), Pseudo-Physical Memory, P2M, Machine Memory; Page Fault!, Shadow Page Table (VM->MFN or VM->P2M); Second Level Paging, HAP; MMU]
Memory Virtualization - Shared Info Page
Structure: wall clock, event channel, memory, TSC; MAX: 32 VCPUs
Compare with start_info_page
 | Start Info Page | Shared Info Page
Mapped by | Domain Builder | Guest OS
Information | Static | Dynamically Updated
I/O Device Virtualization
Hypervisor also provides three mechanisms to use devices.
Emulated Devices
Paravirtualized Driver
Pass-through
I/O Device Virtualization - Emulated Devices
Implemented by QEMU (QEMU-DM)
e.g. sound card, ac97, sb16, etc.
I/O Device Virtualization - Paravirtualized Driver
Split Device Driver Model
An example of sending packets
[Figure: Front-End Driver, Back-End Driver, Native Driver]
I/O Device Virtualization - I/O Ring
The ring carries only requests and replies, not the data itself
An example with GR
[Figure: Dom U and Dom 0 pass GRs over the I/O channel; Grant Table and Active Grant Table; Hypervisor; Device]
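A ring that carries only requests and replies, with the data referenced by GR, can be sketched as a fixed-size buffer with producer and consumer counters. This Python model is a rough illustration: the class `IORing`, its counters, and the integer grant references are assumptions for the example, not the Xen ring macros.

```python
class IORing:
    """Toy shared ring: requests and responses reuse the same slots."""

    def __init__(self, size=4):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0  # requests written by the front end
        self.req_cons = 0  # requests read by the back end
        self.rsp_prod = 0  # responses written by the back end
        self.rsp_cons = 0  # responses read by the front end

    def push_request(self, gref):
        """Front end queues a grant reference, never the data itself."""
        if self.req_prod - self.rsp_cons >= self.size:
            raise BufferError("ring full")
        self.slots[self.req_prod % self.size] = ("request", gref)
        self.req_prod += 1

    def pop_request(self):
        if self.req_cons == self.req_prod:
            return None
        _, gref = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return gref

    def push_response(self, status):
        """Back end answers in the slot of an already consumed request."""
        if self.rsp_prod >= self.req_cons:
            raise RuntimeError("no consumed request to answer")
        self.slots[self.rsp_prod % self.size] = ("response", status)
        self.rsp_prod += 1

    def pop_response(self):
        if self.rsp_cons == self.rsp_prod:
            return None
        _, status = self.slots[self.rsp_cons % self.size]
        self.rsp_cons += 1
        return status
```

A slot becomes free for a new request only after its response has been read, so the ring needs no separate response buffer.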
I/O Device Virtualization - Pass-Through
Pass the device through to a guest, which uses it directly
[Figure: Dom 0 and Dom U guests with native drivers; hypervisor (virtual CPU, virtual memory, scheduling); hardware (physical CPU, physical memory, Eth0, Eth1, …)]
Hardware Virtual Machine
Intel Virtualization Technology
Technology | Description | Virtualization | Implementation
VT-x | Root/NonRoot, Extended Page Tables | CPU, Memory | Instruction Set
VT-i | As VT-x, for Itanium | |
VT-d | DMA, Interrupt | Devices | IOMMU (Chipset)
VT-c | Classify Packets | Network Devices | VMDq, VMDc
CPU Benchmark (1/2)
[Figure: CPU benchmark chart; annotated value 8.3%]
Average over 100 tests, Deviation: 0.066~0.128%
CPU Benchmark (2/2)
[Figure: CPU benchmark chart; annotated value 5%]
Calculate the 32M digits of π.
Hard Disk Drive Benchmark
[Figure: hard disk drive benchmark chart]
Network Benchmark (1/2)
[Figure: network benchmark chart; annotated value 59%]
Testing Time: 180 seconds, Deviation: 0.12~0.26%.
Network Benchmark (2/2)
[Figure: network benchmark chart]
Sample Period: 2 seconds
Average: 9.82%
Domain 1, X: A Fake Domain 0
Architecture
[Figure: BIOS, payload, hypervisor; Dom0 (Linux), Dom1 (Windows), DomU (Android), …; xend, drivers; non-assignable hardware; assignable hardware (VGA, eth, usb, …)]
KVM Architecture (1/2)
KVM (Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V).
[Figure: user space holds user processes and machine emulators, each emulator hosting a guest OS; kernel space holds the Linux kernel with KVM]
KVM Architecture (2/2)
KVM consists of
A loadable kernel module, kvm.ko: provides the core virtualization infrastructure.
A processor-specific module, kvm-intel.ko or kvm-amd.ko: provides the support of hardware virtualization.
A device is created when kvm.ko is loaded.
[Figure: KVM = kvm.ko + kvm-intel.ko / kvm-amd.ko]
Machine Emulator for KVM
QEMU is a generic and open source machine emulator and virtualizer.
QEMU-KVM is a version of QEMU modified to support KVM.
[Figure: QEMU-KVM components: User Interface, Event Loop, CPU Emulator, Translation Buffer, Exception Handler, Memory Management, Emulated Devices, ..., KVM API]
KVM API
There are three types (implemented by ioctl):
On KVM device: KVM_CREATE_VM, KVM_CHECK_EXTENSION, … (create VMs; returns VM id)
On Virtual Machine (VM): KVM_CREATE_VCPU, KVM_ASSIGN_PCI_DEVICE, … (create VCPUs, control VM; returns VCPU id)
On Virtual CPU (VCPU): KVM_RUN, KVM_GET_REGS, KVM_GET_SREGS, … (control VCPU)
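The three ioctl levels can be tried from user space roughly as follows. This is a minimal Python sketch assuming an x86 Linux host, where the `_IO` macro reduces to `(type << 8) | nr`; the helper names `_IO` and `create_vm_and_vcpu` are made up for the example, and the call needs read/write access to /dev/kvm. On Linux the VM and VCPU handles returned by these ioctls are file descriptors.

```python
import fcntl
import os

KVMIO = 0xAE  # ioctl type byte used by the KVM API


def _IO(type_, nr):
    """Encode an ioctl that carries no data (x86 Linux encoding assumed)."""
    return (type_ << 8) | nr


KVM_GET_API_VERSION = _IO(KVMIO, 0x00)  # issued on the KVM device
KVM_CREATE_VM = _IO(KVMIO, 0x01)        # issued on the KVM device
KVM_CREATE_VCPU = _IO(KVMIO, 0x41)      # issued on a VM
KVM_RUN = _IO(KVMIO, 0x80)              # issued on a VCPU


def create_vm_and_vcpu(path="/dev/kvm"):
    """Open the KVM device, create one VM, then one VCPU with id 0.

    Returns (api_version, vm_fd, vcpu_fd). The caller owns and must close
    vm_fd and vcpu_fd. Raises OSError if KVM is unavailable.
    """
    kvm_fd = os.open(path, os.O_RDWR)
    try:
        api_version = fcntl.ioctl(kvm_fd, KVM_GET_API_VERSION, 0)
        vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)
        vcpu_fd = fcntl.ioctl(vm_fd, KVM_CREATE_VCPU, 0)
        return api_version, vm_fd, vcpu_fd
    finally:
        os.close(kvm_fd)
```

Running a guest additionally requires setting up guest memory and registers before KVM_RUN is issued on the VCPU, which this sketch leaves out.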
Using KVM in QEMU-KVM
In cpus.c
static void *qemu_kvm_cpu_thread_fn(void *arg)
{
    ...
    while (1) {
        r = kvm_cpu_exec(env);
        ...
        qemu_kvm_wait_io_event(env);
    }
}
In kvm-all.c
int kvm_cpu_exec(CPUState *env)
{
    ...
    run_ret = kvm_vcpu_ioctl(env, KVM_RUN, 0);
    ...
    switch (run->exit_reason) {
    case KVM_EXIT_IO:
    ...
In cpus.c
static void qemu_kvm_start_vcpu(CPUState *env)
{
    ...
    qemu_thread_create(qemu_kvm_cpu_thread_fn, ...);
    ...
}
[Figure: QEMU-KVM threads in user space, each running a guest OS, issue KVM_RUN to KVM in kernel space]
Compare to Xen
 | Xen | KVM
Hypervisor Type | With Microkernel | With Integrated Kernel (A Kernel Module)
Managing VM | A Modified Linux (Domain-0) | The Integrated Kernel (Linux)
Guest OS | 1. Paravirtualized 2. HVM | HVM
VCPU Scheduling | 1. SEDF 2. Credit/Credit2 | As Linux does, e.g., CFS
Paravirtualized Device | Split driver | virtio
Management Tool | 1. xl (Xen developed) 2. libvirt (3rd party) | libvirt (3rd party)
Related Mechanism | More (XenStore, Grant Table, ...) | Less
In 2010, Chierici drew the following conclusions [1]: 1. KVM proved great stability and reliability. 2. Right now (2010), Xen hypervisor seems to be the best solution, particularly when using the paravirtualized approach.
[1] Andrea Chierici, "A quantitative comparison between xen and kvm", Journal of Physics: Conference Series, IOP Publishing, vol. 219, no. 4, 2010.
Types of Virtualization
Hardware/platform virtualization
Desktop virtualization
Software virtualization: OS-level, Workspace, Application
Storage virtualization: e.g. Virtual Tape Library, 1.2B USD sold to CA, 1996.
Data virtualization
Database virtualization
Network virtualization
WeOS: emerge Our Services
Netizens take charge and create information value together!
[Figure: sites in Taiwan (Taipei, Tainan, Chiayi, ...), Japan (Kyoto, Osaka, Tokyo, ...), and the USA (Seattle, LA, DC, ...) connected over the Internet; Buyer, Seller, Logistics, Cash Flow, ...]
Autonomous ID
Autonomous Distributed Match Engine
V12N to help G11N (I18N + L10N).
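Computer Science and Information Engineering
Information Science
Information Engineering
Information Management
Information Education
Bioinformatics
Medical Informatics
Library and Information Science
Financial Information
Information Electronics
Information Processing
Information Communication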
Market capitalization, 2013/06/30
Values are multiples of Insyde (系微): 1x = NT$2.07 billion
[Figure regions: System Software, Hardware, Application Software]
HTC 99x, ASUS 99x, Quanta 119x
TSMC 1307x
Hon Hai 417x, CyberLink 5x
MediaTek 213x, Trend Micro 61x
TI 560x
Google 4233x, Yahoo 394x
IBM 3071x
Microsoft 4181x, ARM 244x
Intel 1746x
Apple 5395x
VMware 416x
Citrix 164x, Adobe 332x
Symantec 227x
Amazon 1832x, Cisco 1885x
Market capitalization, 2013/10/08
Values are multiples of Insyde (系微): 1x = NT$2.01 billion
[Figure regions: System Software, Hardware, Application Software]
HTC 57x, ASUS 88x, Quanta 124x
TSMC 1342x
Hon Hai 492x, CyberLink 4x
MediaTek 255x, Trend Micro 77x
TI 644x
IBM 2923x
Microsoft 4067x, ARM 322x
Intel 1668x
Apple 6497x
VMware 508x
Citrix 194x, Adobe 372x
Symantec 254x
Amazon 2077x, Cisco 1799x
Google 4227x, Yahoo 511x
Market capitalization, 2013/12/27
Values are multiples of Insyde (系微): 1x = NT$1.83 billion
[Figure regions: System Software, Hardware, Application Software]
HTC 65x, ASUS 108x, Quanta 145x
TSMC 1462x
Hon Hai 568x, CyberLink 5x
MediaTek 318x, Trend Micro 77x
TI 784x
IBM 3309x
Microsoft 5138x, ARM 420x
Intel 2100x
Apple 8341x
VMware 629x
Citrix 188x, Adobe 489x
Symantec 263x
Amazon 3043x, Cisco 1916x
Google 6137x, Yahoo 678x
Answers for Big Questions
How fast can virtualization be? 95+%, 99.9%
What kinds of applications? Well …
What problems might it incur?
  Technical: Big Data?
  Security: How much?
  Business
  Politics
  Globalization (G11N) = Internationalization (I18N) + Localization (L10N)
  …
Homework
Refer to
Xen To-Do List, http://wiki.xen.org/wiki/Xen_Document_Days/TODO
BitCoin, http://bitcoin.org/zh_TW/
WeOS
Each of you should send a one-page report (student-ID.pdf) to [email protected], answering any of the big or related questions in your own words: what problems would you like to solve, and how?
Due on Dec 29.
Your reports will be posted on the course wiki on Dec 30.
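When the false is taken for true, the true becomes false; where the virtual stands in for the real, the real becomes virtual.
Test Data
Platform \ System type | Virtual | Real
Virtual | simulation | evaluation
Real | emulation | implementation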
Summary
Stay hungry to be full [of passion].
Stay foolish to be smart [on absorption].
Virtualized reality vs. real virtualization.
Life of Pi, trust yourself.
Undergraduate project vs. PhD
Creativity vs. entrepreneurship: people, affairs, time, place, things, capital
e.g. Tripod King (鼎王) 1B, sesame oil 1B, pineapple cake 20+B, Taobao, Evernote, Line, Ubitus, Whoscall (6M, 0.5B), Alibaba, Wanda, PTT?
Virtualized to go anywhere?
Just do it, NTU CSIE eSystem!
For Taiwan industry: key is system, system is key.
Reference
Understanding the U.S. national debt crisis in five minutes, http://www.youtube.com/watch?v=K2hhck_kmz0