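Chih-Wen Hsueh (薛智文) [email protected] csie.ntu.tw/~cwhsueh
Department of Computer Science and Information Engineering, National Taiwan University
http://www.csie.ntu.edu.tw/~cwhsueh/100 Fall, Oct 28, Fri 678, DTH 104
Advanced Information Technology - Virtualization (1)
NEWS Lab, Graduate Institute of Networking and Multimedia, Department of Computer Science and Information Engineering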
Preface
Stay hungry, stay foolish
Steve Jobs (Apple, 1955-2011): "Stay hungry, stay foolish."
Dennis Ritchie (C language, 1941-2011)
Skype → eBay (4.1B USD, 2005) → Microsoft (8.5B USD, 2011)
Linux (Linus Torvalds, 1991)
Android (Danger, 2003 → Google, 2005)
MeeGo (Intel, Nokia, Feb 2010)
Tizen (Intel, Samsung [Nokia], Sep 2011)
Windows 8 (Microsoft, NVIDIA, 2011)
iOS 5 (Apple, 2011)
Quanta, TSMC (2011)
Outline
Introduction
  What is virtualization?
  Why is virtualization difficult?
  How to virtualize?
Case Study
  Inline Emulation
  Domain 1
Q&A
What is Virtualization?
The creation of a virtual version of something.
Examples:
  Virtual class
  Virtual circuit
  Virtual community
  Virtual device
  Virtual disk
  Virtual host
  Virtual keyboard
  Virtual machine
  Virtual market
  Virtual memory
  Virtual money
  Virtual Private Network
  Virtual reality
  …
Virtualization:
  Running applications (cross-platform)
  Security
  Sharing hardware resources
  Fully utilizing hardware
  Etc.
Types of Virtualization
Hardware/platform virtualization
Desktop virtualization
Software virtualization: OS-level, workspace, application
Storage virtualization
Data virtualization
Database virtualization
Network virtualization
Big Questions for Virtualization
How fast can virtualization be?
What kinds of applications can there be?
What problems might it incur?
  Technical
  Security
  Business
  Politics
  …
Homework:
  Send the TA a 3-5 page report answering any of the above or related questions.
  1-3 members per group; reports will be posted on the course wiki.
  A 5-minute talk/Q&A in the last hour of class.
Why is Virtualization Difficult?
The OS is moved from ring 0 to ring 1 or ring 3 (hypervisor/guest kernel/user):
  0/1/3 ring layout, e.g. x86_32
  0/3/3 ring layout, e.g. x86_64, ARM
On x86, some sensitive instructions cannot be trapped.
Critical Instructions
  Sensitive Register Instructions
    SGDT, SIDT, SLDT
    SMSW
    PUSHF(D), POPF(D)
  Protected System Instructions
    LAR, LSL, VERR, VERW
    PUSH, POP
    CALL, JMP, INT, RET
    STR
    MOV
Type I - Hypervisor vs. Type II - Hosted VMM
Type I: Hypervisor, e.g. Xen (layers from top to bottom)
  VM0, VM1, …, VMN
  Hypervisor (Virtual Machine Monitor, VMM)
  Hardware
Type II: Hosted VMM, e.g. VMware (layers from top to bottom)
  VM0, VM1, …, VMN
  Hosted VMM
  Host Operating System
  Hardware
VM: Virtual Machine = Guest OS + Virtual Devices
Software Execution Modes in Virtualization Environment
Mode        Physical mode  Virtual mode  Description
Hypervisor  Privileged     N/A           For executing the hypervisor only.
Kernel      User           Privileged    For executing the kernel of a virtual machine.
User        User           User          For executing user processes of a guest OS.
The First Challenge of Virtualization: Virtualizable
According to Popek and Goldberg† (1974), virtual machines can be constructed for a platform if
  Sensitive Instructions ⊆ Privileged Instructions
Sensitive instructions: might change the state of system resources
Privileged instructions: must be executed with sufficient privilege (they trap when executed in user mode)
† G. J. Popek and R. P. Goldberg, "Formal requirements for virtualizable third generation architectures," Commun. ACM, vol. 17, no. 7, pp. 412-421, Jul. 1974.
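The Popek and Goldberg condition is a plain subset test, so it can be illustrated with sets. The following is a minimal sketch, assuming a toy classification of three ARM instructions taken from the later slides (MCR traps in user mode, MSR and MOVS PC, LR do not). The set contents and function names are illustrative only and are not a complete ISA listing.

```python
# Toy instruction classification; illustrative, not a complete ISA listing.
SENSITIVE = {"MCR p15", "MSR cpsr_c", "MOVS PC, LR"}
# Instructions that trap when executed in user mode.
PRIVILEGED = {"MCR p15"}


def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: every sensitive instruction is privileged."""
    return sensitive <= privileged


def critical_instructions(sensitive, privileged):
    """Sensitive instructions that do NOT trap in user mode."""
    return sensitive - privileged


print(is_virtualizable(SENSITIVE, PRIVILEGED))
print(sorted(critical_instructions(SENSITIVE, PRIVILEGED)))
```

The set difference is exactly the group of instructions that binary translation, hypercalls, or inline emulation have to deal with.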
How to Virtualize?
Full virtualization: binary translation
Paravirtualization: hypercall
Hardware-assisted virtualization: trap and emulate (Intel VT-x & AMD SVM)
Case Study
1. Inline Emulation†
2. Domain 1 (with Insyde Inc.)
† Yuan-Cheng Lee, Chih-Wen Hsueh, and Rong-Guey Chang, "Inline Emulation: An Optimization Technique for Virtualization on Embedded Systems," Proc. of the 17th International Conference on Real-Time and Embedded Computing Systems and Applications (RTCSA'11), Toyama, Japan, August 2011.
Inline Emulation
  Motivation
  The First Challenge of Virtualization
  Idea of Inline Emulation
  Design of Inline Emulation
  Evaluation and Analysis
  Conclusions
Motivation
Virtualization on PCs is fast enough: it reaches 90+% of the performance of the same OS running non-virtualized.
We can further utilize multi-core embedded processors to run multiple operating systems on a mobile phone…
Related Work
Secure Xen on ARM (Samsung)
  It showed that virtualization is possible on the ARM platform.
The PENAR project (University of Applied Sciences, Western Switzerland)
  It integrated the source trees of Xen, RTLinux, and Linux for ARM.
OKL4 (Open Kernel Labs)
  A hypervisor for embedded systems that adopts a microkernel architecture.
Issues on Virtualization for ARM
The most critical issue is: Sensitive instructions ⊄ Privileged instructions
Example
  MOVS PC, LR // move the value in the link register to PC (and copy SPSR to CPSR)
  It causes unpredictable behavior when executed in user mode.
  SPSR: Saved Program Status Register
  CPSR: Current Program Status Register
The Problematic Instructions (1/3)
Type I: Instructions that cause an undefined instruction (UDI) exception when executed in user mode
We call them Canonical Privileged Instructions.
Example
  MCR p15, 0, r0, c2, c0, 0
  Move r0 to c2 and c0 in the coprocessor specified by p15, for an operation selected by options 0 and 0
  Operand-dependent operation
The Problematic Instructions (2/3)
Type II: Instructions that have no effect when executed in user mode
Example
  MSR cpsr_c, #0xD3
  Switch to a privileged mode (supervisor) and disable interrupts
Program Status Register (PSR), bit 31 down to bit 0:
  N Z C V Q -- J -- GE[3:0] -- E A I F T M[4:0]
  Field groups: execution flags, exception mask (A, I, F), execution mode (M[4:0])
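To see why writing 0xD3 to the control field selects a privileged mode with interrupts disabled, the low byte of the PSR can be decoded bit by bit. This is a small sketch based on the standard ARM PSR layout (I at bit 7, F at bit 6, T at bit 5, mode in bits 4..0). The function name `decode_control` and the dictionary keys are made up for illustration.

```python
# Control byte of the ARM PSR (bits 7..0).
I_BIT = 1 << 7      # IRQ disable
F_BIT = 1 << 6      # FIQ disable
T_BIT = 1 << 5      # Thumb state
MODE_MASK = 0x1F    # M[4:0]

MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_control(value):
    """Decode the low 8 bits of a PSR value into named fields."""
    return {
        "irq_disabled": bool(value & I_BIT),
        "fiq_disabled": bool(value & F_BIT),
        "thumb": bool(value & T_BIT),
        "mode": MODES.get(value & MODE_MASK, "reserved"),
    }


print(decode_control(0xD3))
```

In user mode the same write leaves the control field unchanged, which is what makes the instruction Type II.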
The Problematic Instructions (3/3)
Type III: Instructions that cause unpredictable behavior when executed in user mode
Example
  MOVS PC, LR
Solutions
Complexity                              Binary translation  Hypercall        Inline emulation
Design                                  High                Low              Low
Implementation                          Medium              High             Low
Runtime                                 High                Medium           Low
Counterpart (in programming languages)  Virtual function    Normal function  Inline function
The First Challenge of Virtualization: Example
Sensitive instructions ⊄ Privileged instructions
For the ARM architecture, the instruction (Type III)
  MOVS PC, LR
changes the program counter and switches to user mode.
However, it causes unpredictable behavior when executed in user mode.
Therefore, it is a sensitive instruction but not a privileged instruction.
The First Challenge of Virtualization: Solutions (1/2)
Dynamic Binary Translation
Basic block before translation:
    BL TLB_FLUSH_DENTRY
    …
  TLB_FLUSH_DENTRY:
    MCR p15, 0, R0, C8, C6, 1
    MOV PC, LR
    …
Basic block after translation:
    BL TLB_FLUSH_DENTRY_NEW
    …
  TLB_FLUSH_DENTRY:
    MCR p15, 0, R0, C8, C6, 1
    MOV PC, LR
    …
  TLB_FLUSH_DENTRY_NEW:
    MOV R1, R0
    MOV R0, #CMD_FLUSH_DENTRY
    SWI #HYPER_CALL_TLB
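The translation step can be sketched as a scan over a basic block that swaps each sensitive instruction for a hypercall sequence. This is a deliberately simplified illustration, not the translator used by any real hypervisor: instructions are plain strings, the replacement table `REPLACEMENTS` is hypothetical, and the code rewrites the instruction in place instead of redirecting the branch to a new block as the slide does.

```python
# Hypothetical mapping from a sensitive instruction to its replacement;
# the single entry mirrors the TLB flush example on the slide.
REPLACEMENTS = {
    "MCR p15, 0, R0, C8, C6, 1": [
        "MOV R1, R0",
        "MOV R0, #CMD_FLUSH_DENTRY",
        "SWI #HYPER_CALL_TLB",
    ],
}


def translate_block(block):
    """Return a translated copy of a basic block (list of instruction strings)."""
    translated = []
    for insn in block:
        # Non-sensitive instructions are copied through unchanged.
        translated.extend(REPLACEMENTS.get(insn, [insn]))
    return translated


original = ["MCR p15, 0, R0, C8, C6, 1", "MOV PC, LR"]
for line in translate_block(original):
    print(line)
```

A real translator works on machine code and has to fix up branch targets, which is where the high design and runtime complexity comes from.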
The First Challenge of Virtualization: Solutions (2/2)
Virtualization APIs - hypercalls
/* In Guest OS */
    BL TLB_FLUSH_DENTRY
    …
  TLB_FLUSH_DENTRY:
    MOV R1, R0
    MOV R0, #CMD_FLUSH_DENTRY
    SWI #HYPER_CALL_TLB
    …
/* In Hypervisor */
  SWI Handler → Hypercall Handler:
    …
    LDR R1, [SP, #4]
    MCR p15, 0, R1, C8, C6, 1
    …
  Restore User Context & PC
Hypercall
[Figure: hypercall flow. Guest OS → hypercalls (software interrupt) → Hypervisor: SWI Handler → Hyper Call Handler → reschedule? (Yes: context switch; No).]
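The hypercall path (software interrupt, SWI handler, hypercall handler, reschedule check) can be modeled as a small dispatcher. The sketch below is a toy model in Python rather than hypervisor code. The class `ToyHypervisor`, the numeric values of `CMD_FLUSH_DENTRY` and `HYPER_CALL_TLB`, and the `need_resched` flag are all assumptions made for illustration.

```python
CMD_FLUSH_DENTRY = 1     # hypothetical command number
HYPER_CALL_TLB = 0x10    # hypothetical SWI number


class ToyHypervisor:
    def __init__(self):
        self.flushed_entries = []   # stands in for the real TLB flush
        self.need_resched = False
        self.context_switches = 0
        self.hypercalls = {HYPER_CALL_TLB: self.hypercall_tlb}

    def hypercall_tlb(self, cmd, arg):
        if cmd != CMD_FLUSH_DENTRY:
            raise ValueError(f"unknown TLB command {cmd}")
        # On real hardware: MCR p15, 0, R1, C8, C6, 1
        self.flushed_entries.append(arg)

    def swi_handler(self, swi_number, r0, r1):
        """Dispatch a software interrupt to the matching hypercall handler."""
        handler = self.hypercalls.get(swi_number)
        if handler is None:
            raise ValueError(f"unknown hypercall {swi_number:#x}")
        handler(r0, r1)
        if self.need_resched:       # the "reschedule?" decision
            self.context_switches += 1
            self.need_resched = False
        # otherwise: restore user context and return to the guest


hv = ToyHypervisor()
hv.swi_handler(HYPER_CALL_TLB, CMD_FLUSH_DENTRY, 0xC0008000)
print(hv.flushed_entries, hv.context_switches)
```

Every hypercall pays for the full save, dispatch, and restore path, which is the overhead inline emulation tries to avoid.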
Idea of Inline Emulation
The Original Instruction
    MOV R0, VIRT_ADDR
    MCR p15, 0, R0, C8, C6, 1
Hypercall
  Guest OS
    MOV R0, #CMD_FLUSH_DENTRY
    MOV R1, VIRT_ADDR
    SWI #HYPER_CALL_TLB
  Hypercall Handler
    …
    LDR R1, [SP, #4]
    MCR p15, 0, R1, C8, C6, 1
    …
    Restore User Context & PC
Inline Emulation
  Guest OS
    MOV R0, VIRT_ADDR
    MCR p15, 0, R0, C8, C6, 1
  Inline Emulation Handler
    …
    /* restore user context */
    LDMIA SP, {R0-R14}
    MCR p15, 0, R0, C8, C6, 1 /* the guest's instruction, inlined */
    …
    Restore PC
Inline Emulation
[Figure: inline emulation flow between the guest OS and the hypervisor. Canonical privileged instructions (Type I) → UDI exception → UDI Handler → Inline Emulation → return to guest. Hypercalls → software interrupt → SWI Handler → Hyper Call Handler → reschedule? (Yes: context switch; No).]
Design of Inline Emulation: The Main Handler
[Figure: flowchart of the main handler with two outcomes: a handler for the instruction is found; no handler for the instruction is found.]
The Issue of Finding an Inline Emulation Handler
It is hard to find a simple hash function, because the encoding of ARM instructions is complicated.
Instead, we can construct an efficient search table, because only a few instructions are frequently used.
Instruction                  Ratio (%)
mcr p15, 0, Rd, c3, c0, 0    58.44
mcr p15, 0, Rd, c7, c14, 1   39.73
mcr p15, 0, Rd, c8, c5, 1    0.49
mcr p15, 0, Rd, c8, c6, 1    0.49
mcr p15, 0, Rd, c7, c10, 4   0.24
mcr p15, 0, Rd, c2, c0, 0    0.23
mcr p15, 0, Rd, c7, c5, 0    0.11
mcr p15, 0, Rd, c8, c5, 0    0.08
mcr p15, 0, Rd, c8, c6, 0    0.08
mrc p15, 0, Rd, c7, c14, 3   0.11
Others                       <0.01
Example of M-to-1 Search Table
Encoding of MCR instruction
Syntax: MCR{cond} cp, op1, Rd, CRn, CRm, op2
mask value handler Set
0x0F1F0F10 0x0E030F10 handler_CR3 MCR p15, op1, Rd, c3, CRm, op2
0x0F1C0F10 0x0E000F10 handler_CR02 MCR p15, op1, Rd, {c0 - c2}, CRm, op2
0x0F100F10 0x0E000F10 handler_CRX MCR p15, op1, Rd, {c4 - c15}, CRm, op2
……
0x00000000 0x00000000 0x00000000 End of Table
Encoding (bit 31 down to bit 0): cond | 1110 | op1 | 0 | CRn | Rd | cp | op2 | 1 | CRm
* An entry E is matched if (instruction AND E.mask) = E.value
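A mask/value search table of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: the mask and value words are derived from the MCR bit layout on the slide (with bit 20 clear, as that layout shows for MCR) rather than copied from the table, entries are tried in order so the first match wins, and `encode_mcr` and `find_handler` are hypothetical helper names.

```python
# MCR layout: cond[31:28] 1110[27:24] op1[23:21] 0[20] CRn[19:16]
#             Rd[15:12] cp[11:8] op2[7:5] 1[4] CRm[3:0]
FIXED_MASK = (0xF << 24) | (1 << 20) | (0xF << 8) | (1 << 4)
FIXED_VALUE = (0xE << 24) | (0 << 20) | (15 << 8) | (1 << 4)  # MCR to p15
CRN_ALL = 0xF << 16      # compare all four CRn bits
CRN_HIGH2 = 0xC << 16    # compare only CRn[3:2]

# (mask, value, handler name); order matters, first match wins.
SEARCH_TABLE = [
    (FIXED_MASK | CRN_ALL, FIXED_VALUE | (3 << 16), "handler_CR3"),
    (FIXED_MASK | CRN_HIGH2, FIXED_VALUE, "handler_CR02"),
    (FIXED_MASK, FIXED_VALUE, "handler_CRX"),
]


def encode_mcr(crn, crm, op2, rd=0, op1=0, cond=0xE, cp=15):
    """Assemble an MCR instruction word from its fields."""
    return ((cond << 28) | (0xE << 24) | (op1 << 21) | (crn << 16)
            | (rd << 12) | (cp << 8) | (op2 << 5) | (1 << 4) | crm)


def find_handler(insn):
    """Return the first handler whose entry matches, or None."""
    for mask, value, handler in SEARCH_TABLE:
        if (insn & mask) == value:
            return handler
    return None


print(find_handler(encode_mcr(3, 0, 0)))     # mcr p15, 0, r0, c3, c0, 0
print(find_handler(encode_mcr(7, 14, 1)))    # mcr p15, 0, r0, c7, c14, 1
```

Placing the most frequent instructions first keeps the common case to one or two comparisons, which is the point of preferring a table over a hash function here.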
Design of Inline Emulation: Dynamic Inline Emulation (DIE) Handler
Self-modifying:
  inlining the instruction
  flushing caches
Design of Inline Emulation: Static Inline Emulation (SIE) Handler
  executing the hard-coded instructions /* data synchronization barrier */
  restoring user context & PC
Evaluation and Analysis: The Experiment Environment
Emulator     Android emulator (ARMv5)
Memory       12MB for the hypervisor; 32MB for the guest OS
Hypervisor   Xen 4.0.1 for ARMv5
Guest OS     Linux 2.6.29-Goldfish
Compilation  Using GCC with debug (-g) flag
Evaluation and Analysis: The Distribution of Emulated Instructions
Instruction  CRn, CRm, op2  Ratio (%)
MCR          c3, c0, 0      58.44
             c7, c14, 1     39.73
             c8, c5, 1      0.49
             c8, c6, 1      0.49
             c7, c10, 4     0.24
             c2, c0, 0      0.23
             c7, c5, 0      0.11
             c8, c5, 0      0.08
             c8, c6, 0      0.08
MRC          c7, c14, 3     0.11
Others                      <0.01
The first two MCR instructions together account for more than 98%.
Evaluation and Analysis: The Micro-Level Analysis (1/2)
Operation: invalidating TLB. USER, UND, SWI, and Total are instruction counts per processor mode. PV: paravirtualization; IE: inline emulation.
Operation                     Method  USER   UND    SWI     Total   Improvement PV/IE (%)
A single entry (DIE handler)  PV      13.00  0.00   305.97  318.97  613.39
                              IE      3.00   49.00  0.00    52.00
The entire TLB (SIE handler)  PV      11.00  0.00   305.80  316.80  704.01
                              IE      3.00   42.00  0.00    45.00
Evaluation and Analysis: The Micro-Level Analysis (2/2)
USER, UND, SWI, and Total are instruction counts per processor mode. PV: paravirtualization; IE: inline emulation.
Instruction                               Method  USER   UND    SWI     Total   Improvement PV/IE (%)
MCR p15, 0, Rd, c3, c0, 0 (DIE handler)   PV      9.00   0.00   203.29  212.29  424.57
                                          IE      3.00   47.00  0.00    50.00
MCR p15, 0, Rd, c7, c14, 1 (DIE handler)  PV      13.00  0.00   304.50  317.50  566.94
                                          IE      3.00   53.00  0.00    56.00
Inline emulation achieves at least 4.24x the performance of hypercalls in most cases (about 98%).
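The improvement column is the ratio of total instruction counts, PV divided by IE, expressed in percent. The short sketch below recomputes it from the totals in the two micro-level tables. The dictionary keys are just labels chosen here, and the recomputed figures can differ from the slide in the second decimal place, presumably because the slide totals are themselves rounded averages.

```python
# (PV total, IE total) instruction counts from the micro-level tables.
TOTALS = {
    "invalidate single TLB entry (DIE)": (318.97, 52.00),
    "invalidate entire TLB (SIE)": (316.80, 45.00),
    "MCR p15, 0, Rd, c3, c0, 0 (DIE)": (212.29, 50.00),
    "MCR p15, 0, Rd, c7, c14, 1 (DIE)": (317.50, 56.00),
}


def improvement_percent(pv_total, ie_total):
    """PV/IE ratio in percent, as in the Improvement column."""
    return pv_total / ie_total * 100


for name, (pv, ie) in TOTALS.items():
    print(f"{name}: {improvement_percent(pv, ie):.2f}%")
```

The smallest ratio, for the most frequent instruction, is what the "at least 4.24x" claim refers to.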
Evaluation and Analysis: The Macro-Level Analysis
                                   Data Processing  Data Transfer  Branch  Software Interrupt  Coprocessor and Other  Total
Paravirtualization (instructions)  89.22M           91.28M         27.08M  48560               4.79M                  212.42M
Inline Emulation (instructions)    89.04M           90.66M         26.93M  33658               4.93M                  211.59M
(PV - IE) / PV (%)                 0.20             0.68           0.53    30.69               -2.72                  0.39
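The last row of the macro-level table is the relative reduction in executed instructions, (PV - IE) / PV. As a quick check, the sketch below applies that formula to the two columns whose counts are given most precisely: the software interrupt column (exact integers) and the total column (in millions). The function name is chosen here for illustration.

```python
def reduction_percent(pv_count, ie_count):
    """Relative reduction (PV - IE) / PV, in percent."""
    return (pv_count - ie_count) / pv_count * 100


# Software interrupt instructions: exact counts from the table.
print(round(reduction_percent(48560, 33658), 2))
# Total instructions, in millions.
print(round(reduction_percent(212.42, 211.59), 2))
```

The large drop in software interrupt instructions barely moves the total because that category is a tiny share of all executed instructions.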
Conclusions
Inline emulation:
  Reduces the effort needed to port guest operating systems
  Speeds up the handling of sensitive instructions (4-7x)
  Improves overall system performance (0.39% fewer instructions executed)
Future work
  Optimization for memory virtualization
  A much higher overall speedup is possible.