Chih-Wen Hsueh (薛智文) [email protected]
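Department of Computer Science and Information Engineering, National Taiwan University
http://www.csie.ntu.edu.tw/~cwhsueh/
101 Spring, March 15, Fri 678, DTH 104
Advanced Information Technology (II)
- Virtualization (1) -
Virtualization (V12N)
NEWS Lab, CSIE Department and Graduate Institute of Networking and Multimedia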
Steve Jobs (Apple, 1955-2011): Stay hungry, stay foolish.
Dennis Ritchie (C language, 1941-2011)
Skype: eBay (4.1B USD, 2005), Microsoft (8.5B USD, 2011)
Linux (Linus Torvalds, 1991)
Android (Danger, 2003; Google, 2005, 34 eng.)
MeeGo (Intel Samsung, Feb 2010)
Tizen (Intel Samsung [Nokia], Sep 2011)
Windows 8 (Microsoft, nVidia 2011)
iOS 5 (Apple, 2011)
Firefox OS (Mozilla, 2012)
MSN (Microsoft, 2013)
Ubuntu Touch (Canonical, 2013)
Android (85 eng., 2013), Andy Rubin (Today), Asus NB (200+ eng.)
iPhone, iPad, HTC One, Samsung Galaxy S4 (Today)
Preface
Introduction
- What is virtualization? Cloud?
- Why is virtualization difficult?
- How to virtualize?
Case Study
- Mobile Virtualization
- Inline Emulation
- Domain 1
Q&A
Outline
Virtual address, Virtual assistant, Virtual class, Virtual circuit, Virtual community, Virtual Data Center, Virtual device, Virtual disk, Virtual host, Virtual keyboard, Virtual machine, Virtual market, Virtual memory, Virtual money, Virtual Private Network, Virtual reality, …
What is Virtualization ?
Virtualization: running applications (x-platform), security, sharing hardware resources, fully utilizing hardware, etc.
The creation of a virtual version of something.
Virtual Assistant/Secretary
- S/he screens my email. She checks my main email accounts, handles what she can, and "redirects" the messages that require my personal attention to my private account. She has reduced my email load by 90 percent.
- She books my travel. She handles all the details, including airline reservations, hotels, car rental, etc. She sets up a trip in TripIt, so I have everything I need in one place.
- She makes calls on my behalf. She makes appointments (both personal and professional), confirms my appointments, checks my voice mail, and follows up as needed.
- She manages my calendar. Almost nothing gets on my calendar unless it passes through her first. We have agreed together that I will only accept appointments on two afternoons a week, and she works to stay within those boundaries.
- She handles other projects as needed. I continue to turn over more and more to her. For example, she recently screened all the people who had applied to be a community leader on my site. She and my manager, Joy, ended up picking the final ten I appointed.
- File my files.
Types of Virtualization
- Hardware/platform virtualization
- Desktop virtualization
- Software virtualization: OS-level, Workspace, Application
- Storage virtualization
- Data virtualization
- Database virtualization
- Network virtualization
How fast can virtualization be?
What kinds of applications can there be?
What problems might it incur?
- Technical, security, business, politics, …
Homework: Turn in a 3-5 page report answering any of the above or related questions. What problems do you solve? How?
- 1-3 members per group; reports will be posted on the course wiki.
Q&A in the last hour of class.
Big Questions for Virtualization
When the false is taken as true, the true becomes false too. (假若真時真亦假)
System type
                 Platform: Virtual   Platform: Real
Data: Virtual    simulation          evaluation
Data: Real       emulation           implementation
Why Is Virtualization Difficult?
The OS is moved from ring 0 to ring 1 or ring 3.
- 0/1/3 ring, e.g. x86_32
- 0/3/3 ring, e.g. x86_64, ARM
On x86, some sensitive instructions cannot be trapped.
Critical instructions:
- Sensitive register instructions: SGDT, SIDT, SLDT, SMSW, PUSHF(D), POPF(D)
- Protected system instructions: LAR, LSL, VERR, VERW, PUSH, POP, CALL, JMP, INT, RET, STR, MOV
Type I - Hypervisor, e.g. Xen (top to bottom):
    VM0, VM1, …, VMN
    Virtual Machine Monitor (VMM) / Hypervisor
    Hardware
Type II - Hosted VMM, e.g. VMware (top to bottom):
    VM0, VM1, …, VMN
    Hosted VMM
    Host Operating System
    Hardware
VM: Virtual Machine, Guest OS + Virtual Devices
Software Execution Modes in Virtualization Environment
Mode         Physical mode   Virtual mode   Description
Hypervisor   Privileged      N/A            For executing the hypervisor only.
Kernel       User            Privileged     For executing the kernel of a virtual machine.
User         User            User           For executing user processes of a guest OS.
The First Challenge of Virtualization: Virtualizable
According to Popek and Goldberg† (1974), virtual machines can be constructed for a platform if
    Sensitive Instructions ⊆ Privileged Instructions
- Sensitive instructions: might change the state of system resources
- Privileged instructions: must be executed with sufficient privilege
† G. J. Popek and R. P. Goldberg, “Formal requirements for virtualizable third generation architectures,” Commun. ACM, vol. 17, no. 7, pp. 412–421, Jul. 1974.
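The subset condition can be checked mechanically once the two instruction sets are written down. The following is only a minimal sketch: the function names are hypothetical, and the two sets are small illustrative subsets of x86 instructions rather than a complete classification.

```python
def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: every sensitive instruction is privileged."""
    return set(sensitive) <= set(privileged)


def untrappable(sensitive, privileged):
    """Sensitive instructions that would execute without trapping."""
    return sorted(set(sensitive) - set(privileged))


# Illustrative subsets only (assumption: not a full x86 classification).
X86_PRIVILEGED = {"LGDT", "LIDT", "LMSW"}
X86_SENSITIVE = X86_PRIVILEGED | {"SGDT", "SIDT", "SMSW", "POPF"}

if __name__ == "__main__":
    print(is_virtualizable(X86_SENSITIVE, X86_PRIVILEGED))
    print(untrappable(X86_SENSITIVE, X86_PRIVILEGED))
```

Any instruction reported by `untrappable` needs another mechanism, such as binary translation or a hypercall, because the hardware will not hand it to the hypervisor.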
How to Virtualize?
- Full Virtualization: binary translation
- Para Virtualization: hypercall
- Hardware Assisted Virtualization: Intel VT-x & AMD SVM, trap and emulate
Case Study
1. Mobile Virtualization
2. Inline Emulation†
3. Domain 1
   • with Insyde Inc.
† Yuan-Cheng Lee, Chih-Wen Hsueh, and Rong-Guey Chang, "Inline Emulation: An Optimization Technique for Virtualization on Embedded Systems," Proc. of the 17th International Conference on Real-Time and Embedded Computing Systems and Applications (RTCSA'11), Toyama, Japan, August 2011.
Mobile Virtualization
90+% performance on PC embedded multicore systems
To run multiple OSes on a mobile phone…
iPhone+Android is possible!
Break the limitation of OSes!
Hardware Assisted Paging
VA -> Primary MMU (Page Table) -> GPA -> Extended MMU (Extended Page Table) -> MPA -> Main Memory
VA: virtual address
GPA: guest physical address
MPA: machine physical address
MMU: memory management unit
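The two translation stages can be sketched with two lookup tables. This is a simplified model under stated assumptions: 4 KiB pages, single-level tables held in Python dictionaries, and hypothetical names throughout. Real hardware walks multi-level tables and also translates the guest's own page-table accesses through the extended table.

```python
PAGE_SIZE = 4096  # assumed page size


def translate(addr, table):
    """Map an address through one table of page-number -> frame-number."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise KeyError(f"page fault at {addr:#x}")
    return table[page] * PAGE_SIZE + offset


def va_to_mpa(va, guest_page_table, extended_page_table):
    """Stage 1 (primary MMU): VA -> GPA. Stage 2 (extended MMU): GPA -> MPA."""
    gpa = translate(va, guest_page_table)
    return translate(gpa, extended_page_table)


# Hypothetical mappings: guest page 0x1 -> guest frame 0x7 -> machine frame 0x2A.
guest_page_table = {0x1: 0x7}
extended_page_table = {0x7: 0x2A}

if __name__ == "__main__":
    print(hex(va_to_mpa(0x1234, guest_page_table, extended_page_table)))
```

The guest OS only ever edits `guest_page_table`; the hypervisor owns `extended_page_table`, which is why the guest needs no modification for memory virtualization in this scheme.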
0-miss Page Translation
[Figure: translation path VA, GPA, MPA to main memory, using the page table and the primary MMU with a bTLB]
VA: virtual address
GPA: guest physical address
MPA: machine physical address
MMU: memory management unit
bTLB: buddy TLB
†Yuan-Cheng Lee, Chih-Wen Hsueh, "An Optimized Page Translation for Mobile Virtualization," to appear in Proc. of 50th Design Automation Conference (DAC), Austin, TX, USA, June 2013. (Top conference)
50+% speedup
- The First Challenge of Virtualization
- Idea of Inline Emulation
- Design of Inline Emulation
- Evaluation and Analysis
- Conclusions
Inline Emulation
Related Work
- Secure Xen on ARM (Samsung)
  It proved that virtualization is possible on the ARM platform.
- The PENAR project (University of Applied Sciences, Western Switzerland)
  It integrated the source trees of Xen, RTLinux, and Linux for ARM.
- OKL4 (Open Kernel Labs)
  A hypervisor that adopts a microkernel architecture for embedded systems.
Issues on Virtualization for ARM
The most critical issue is:
    Sensitive instructions ⊄ Privileged instructions
Example:
    MOVS PC, LR // move the value in the link register to PC and copy SPSR to CPSR
It causes unpredictable behavior when executed in user mode.
SPSR: Saved Program Status Register
CPSR: Current Program Status Register
The Problematic Instructions (1/3)
Type I: Instructions that cause an undefined instruction (UDI) exception when executed in user mode.
We call them Canonical Privileged Instructions.
Example:
    MCR p15, 0, r0, c2, c0, 0
Move r0 to the registers c2 and c0 of coprocessor p15; the operation performed depends on the options 0 and 0.
Operand-dependent operation
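To see why the operation depends on the operands, it helps to look at how the operands are packed into the 32-bit instruction word. The sketch below follows the standard ARM MCR field layout (cond, 1110, op1, 0, CRn, Rd, cp, op2, 1, CRm); the function name and the default "always" condition code are illustrative choices, not part of the slides.

```python
def encode_mcr(cp, op1, rd, crn, crm, op2, cond=0xE):
    """Pack MCR{cond} cp, op1, Rd, CRn, CRm, op2 into a 32-bit ARM word."""
    widths = {"cond": (cond, 4), "cp": (cp, 4), "op1": (op1, 3), "rd": (rd, 4),
              "crn": (crn, 4), "crm": (crm, 4), "op2": (op2, 3)}
    for name, (value, bits) in widths.items():
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return ((cond << 28) | (0b1110 << 24) | (op1 << 21) | (0 << 20)
            | (crn << 16) | (rd << 12) | (cp << 8) | (op2 << 5)
            | (1 << 4) | crm)


if __name__ == "__main__":
    # MCR p15, 0, r0, c2, c0, 0 (the Type I example)
    print(hex(encode_mcr(cp=15, op1=0, rd=0, crn=2, crm=0, op2=0)))
    # MCR p15, 0, r0, c8, c6, 1 (the TLB flush used in later examples)
    print(hex(encode_mcr(cp=15, op1=0, rd=0, crn=8, crm=6, op2=1)))
```

Both words share the same opcode bits and differ only in the CRn, CRm, and op2 fields, so a hypervisor has to inspect those fields to know which coprocessor operation the guest requested.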
The Problematic Instructions (2/3)
Type II: Instructions that have no effect when executed in user mode.
Example:
    MSR cpsr_c, #0xD3
Switch to a privileged mode and disable interrupts.
Bits 31..0: N Z C V Q | -- | J | -- | GE[3:0] | -- | E A I F T M[4:0]
Field groups: execution flags, exception mask, execution mode
Program Status Register (PSR)
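The immediate 0xD3 in the example can be decoded against the low byte of the PSR (I, F, T, M[4:0]). The sketch below uses the standard ARM mode encodings; the function and dictionary names are hypothetical.

```python
# Standard ARM mode encodings for M[4:0].
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_psr_control(value):
    """Decode the control byte (bits 7..0) written by MSR cpsr_c."""
    return {
        "irq_disabled": bool(value & 0x80),  # I bit
        "fiq_disabled": bool(value & 0x40),  # F bit
        "thumb": bool(value & 0x20),         # T bit
        "mode": MODES.get(value & 0x1F, "unknown"),
    }


if __name__ == "__main__":
    print(decode_psr_control(0xD3))
```

For 0xD3 both interrupt masks are set and the mode field selects Supervisor, which matches the description of the example. In user mode the write to these bits is silently ignored, so the hypervisor never learns that the guest tried to change them.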
The Problematic Instructions (3/3)
Type III: Instructions that cause unpredictable behavior when executed in user mode.
Example:
    MOVS PC, LR
Solutions
Complexity                  Binary translation   Hypercall         Inline emulation
Design                      High                 Low               Low
Implementation              Medium               High              Low
Runtime                     High                 Medium            Low
Counterpart                 Virtual function     Normal function   Inline function
(in programming languages)
The First Challenge of Virtualization: Example
For the ARM architecture, the Type III instruction
    MOVS PC, LR
changes the program counter and switches to user mode. However, it causes unpredictable behavior when executed in user mode. Therefore, it is a sensitive instruction but not a privileged instruction:
    Sensitive instructions ⊄ Privileged instructions
The First Challenge of Virtualization: Solutions (1/2)
Dynamic Binary Translation

Before translation (basic block):
        BL TLB_FLUSH_DENTRY
        …
    TLB_FLUSH_DENTRY:
        MCR p15, 0, R0, C8, C6, 1
        MOV PC, LR
        …

After translation:
        BL TLB_FLUSH_DENTRY_NEW
        …
    TLB_FLUSH_DENTRY:
        MCR p15, 0, R0, C8, C6, 1
        MOV PC, LR
        …
    TLB_FLUSH_DENTRY_NEW:
        MOV R1, R0
        MOV R0, #CMD_FLUSH_DENTRY
        SWI #HYPER_CALL_TLB
The First Challenge of Virtualization: Solutions (2/2)
Virtualization APIs - hypercalls

/* In Guest OS */
        BL TLB_FLUSH_DENTRY
        …
    TLB_FLUSH_DENTRY:
        MOV R1, R0
        MOV R0, #CMD_FLUSH_DENTRY
        SWI #HYPER_CALL_TLB
        …

/* In Hypervisor */
    SWI Handler -> Hypercall Handler:
        ……
        LDR R1, [SP, #4]
        MCR p15, 0, R1, C8, C6, 1
    Restore User Context & PC
Hypercall
[Figure: the guest OS issues hypercalls through a software interrupt; in the hypervisor, the SWI handler passes them to the hypercall handler, followed by a "reschedule?" decision (No/Yes) and a context switch.]
Idea of Inline Emulation

The original instruction:
        MOV R0, VIRT_ADDR
        MCR p15, 0, R0, C8, C6, 1

Hypercall:
    Guest OS:
        MOV R0, #CMD_FLUSH_DENTRY
        MOV R1, VIRT_ADDR
        SWI #HYPER_CALL_TLB
    Hypercall Handler:
        ……
        LDR R1, [SP, #4]
        MCR p15, 0, R1, C8, C6, 1
        Restore User Context & PC

Inline Emulation:
    Guest OS:
        MOV R0, VIRT_ADDR
        MCR p15, 0, R0, C8, C6, 1
    Inline Emulation Handler:
        ……
        /* restore user context */
        LDMIA SP, {R0-R14}
        MCR p15, 0, R0, C8, C6, 1
        Restore PC
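Inline Emulation
[Figure: in the guest OS, canonical privileged instructions (Type I) raise a UDI exception that enters the hypervisor's UDI handler, which performs inline emulation and returns to the guest. Hypercalls still enter through a software interrupt to the SWI handler and the hypercall handler, followed by a "reschedule?" decision (No/Yes) and a context switch.]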
Design of Inline Emulation: The Main Handler
[Flowchart with two outcomes]
- A handler for the instruction is found.
- No handler for the instruction is found.
The Issue of Finding an Inline Emulation Handler
It is hard to find a simple hash function, because the encoding of ARM instructions is complicated.
Instead, we can construct an efficient search table, because only a few instructions are used frequently.
Instruction Ratio (%)
mcr p15, 0, Rd, c3, c0, 0 58.44
mcr p15, 0, Rd, c7, c14, 1 39.73
mcr p15, 0, Rd, c8, c5, 1 0.49
mcr p15, 0, Rd, c8, c6, 1 0.49
mcr p15, 0, Rd, c7, c10, 4 0.24
mcr p15, 0, Rd, c2, c0, 0 0.23
mcr p15, 0, Rd, c7, c5, 0 0.11
mcr p15, 0, Rd, c8, c5, 0 0.08
mcr p15, 0, Rd, c8, c6, 0 0.08
mrc p15, 0, Rd, c7, c14, 3 0.11
Others <0.01
Example of Mto1 Search Table
Encoding of MCR instruction
Syntax: MCR{cond} cp, op1, Rd, CRn, CRm, op2
Bits 31..0: cond | 1110 | op1 | 0 | CRn | Rd | cp | op2 | 1 | CRm

mask         value        handler        Set
0x0F1F0F10   0x0E130F10   handler_CR3    MCR 15, op1, Rd, c3, CRm, op2
0x0F1C0F10   0x0E100F10   handler_CR02   MCR 15, op1, Rd, {c0 - c2}, CRm, op2
0x0F100F10   0x0E100F10   handler_CRX    MCR 15, op1, Rd, {c4 - c15}, CRm, op2
……
0x00000000   0x00000000   0x00000000     End of Table

* An entry E is matched if
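A search over such a table can be sketched as below. The matching rule is an assumption, since the condition itself did not survive on the slide: an entry is taken to match when the instruction word ANDed with the entry's mask equals the entry's value, and the first matching entry wins. The three entries are copied from the table as printed, the handler names are kept as plain strings, and the test words are synthetic values built to satisfy those entries rather than assembled instructions.

```python
# (mask, value, handler) entries copied from the table above, in order.
SEARCH_TABLE = [
    (0x0F1F0F10, 0x0E130F10, "handler_CR3"),
    (0x0F1C0F10, 0x0E100F10, "handler_CR02"),
    (0x0F100F10, 0x0E100F10, "handler_CRX"),
]


def find_handler(word, table=SEARCH_TABLE):
    """Return the first handler whose (mask, value) pair matches the word.

    Assumed rule: an entry matches if (word & mask) == value.
    Returns None when no entry matches.
    """
    for mask, value, handler in table:
        if (word & mask) == value:
            return handler
    return None


if __name__ == "__main__":
    for word in (0xEE130F10, 0xEE110F10, 0xEE170F10, 0xE1A00000):
        print(hex(word), find_handler(word))
```

Ordering matters: the more specific masks come first, so the catch-all third entry only sees words that the first two rejected. Placing the most frequently emulated instructions at the top keeps the common case to one or two comparisons.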
Design of Inline Emulation: Dynamic Inline Emulation (DIE) Handler
[Flowchart with the labeled steps below]
- Self-modifying
- Inlining the instruction
- Flushing caches
Design of Inline Emulation: Static Inline Emulation (SIE) Handler
[Flowchart with the labeled steps below]
- Executing the hard-coded instructions /* data synchronization barrier */
- Restoring user context & PC
Evaluation and Analysis: The Experiment Environment
Emulator      Android emulator (ARMv5)
Memory        12MB for the hypervisor
              32MB for the guest OS
Hypervisor    Xen 4.0.1 for ARMv5
Guest OS      Linux 2.6.29-Goldfish
Compilation   Using GCC with debug (-g) flag
Evaluation and Analysis: The Distribution of Emulated Instructions
Instruction   CRn, CRm, op2   Ratio (%)
MCR           c3, c0, 0       58.44
              c7, c14, 1      39.73
              c8, c5, 1       0.49
              c8, c6, 1       0.49
              c7, c10, 4      0.24
              c2, c0, 0       0.23
              c7, c5, 0       0.11
              c8, c5, 0       0.08
              c8, c6, 0       0.08
MRC           c7, c14, 3      0.11
Others                        <0.01
The first two MCR rows together account for more than 98%.
Evaluation and Analysis: The Micro-Level Analysis (1/2)
PV: paravirtualization (hypercall); IE: inline emulation

Operation - Invalidating TLB         Mode (instructions)                Improvement
                                     USER    UND     SWI      Total     PV/IE (%)
A single entry (DIE handler)   PV    13.00   0.00    305.97   318.97    613.39
                               IE    3.00    49.00   0.00     52.00
The entire TLB (SIE handler)   PV    11.00   0.00    305.80   316.80    704.01
                               IE    3.00    42.00   0.00     45.00
Evaluation and Analysis: The Micro-Level Analysis (2/2)

Instruction                          Mode (instructions)                Improvement
                                     USER    UND     SWI      Total     PV/IE (%)
MCR p15, 0, Rd, c3, c0, 0      PV    9.00    0.00    203.29   212.29    424.57
(DIE handler)                  IE    3.00    47.00   0.00     50.00
MCR p15, 0, Rd, c7, c14, 1     PV    13.00   0.00    304.50   317.50    566.94
(DIE handler)                  IE    3.00    53.00   0.00     56.00
Inline emulation achieves at least 4.24x the performance of hypercalls in most cases (about 98%).
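The improvement column is the PV instruction total divided by the IE total, expressed as a percentage. The short sketch below recomputes it from the four rows of the micro-level tables; the names are hypothetical, and because the printed totals are rounded to two decimals, the recomputed values can differ from the reported ones in the second decimal place.

```python
def improvement_percent(pv_total, ie_total):
    """Improvement PV/IE (%) = PV instruction count / IE instruction count * 100."""
    return pv_total / ie_total * 100.0


# (PV total, IE total, reported improvement) from the micro-level tables.
ROWS = {
    "invalidate single TLB entry": (318.97, 52.00, 613.39),
    "invalidate entire TLB": (316.80, 45.00, 704.01),
    "MCR p15, 0, Rd, c3, c0, 0": (212.29, 50.00, 424.57),
    "MCR p15, 0, Rd, c7, c14, 1": (317.50, 56.00, 566.94),
}

if __name__ == "__main__":
    for name, (pv, ie, reported) in ROWS.items():
        print(f"{name}: {improvement_percent(pv, ie):.2f}% (reported {reported}%)")
```

The smallest ratio, for MCR p15, 0, Rd, c3, c0, 0, is the source of the "at least 4.24x" figure.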
Evaluation and Analysis: The Macro-Level Analysis

                                    Data         Data       Branch   Software    Coprocessor   Total
                                    Processing   Transfer            Interrupt   and Other
Paravirtualization (instructions)   89.22M       91.28M     27.08M   48560       4.79M         212.42M
Inline Emulation (instructions)     89.04M       90.66M     26.93M   33658       4.93M         211.59M
(PV - IE) / PV (%)                  0.20         0.68       0.53     30.69       -2.72         0.39
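The last row of the table is the relative reduction in executed instructions. A minimal sketch of that arithmetic follows, with hypothetical names. Only the columns whose printed counts reproduce the reported percentage are included; for the other columns the counts are rounded to 0.01M, so the recomputed percentage can differ from the reported one.

```python
def reduction_percent(pv, ie):
    """(PV - IE) / PV as a percentage; positive means IE executes fewer instructions."""
    return (pv - ie) / pv * 100.0


# (PV count, IE count, reported %) from the macro-level table.
COLUMNS = {
    "Data Processing (M)": (89.22, 89.04, 0.20),
    "Data Transfer (M)": (91.28, 90.66, 0.68),
    "Software Interrupt": (48560, 33658, 30.69),
    "Total (M)": (212.42, 211.59, 0.39),
}

if __name__ == "__main__":
    for name, (pv, ie, reported) in COLUMNS.items():
        print(f"{name}: {reduction_percent(pv, ie):.2f}% (reported {reported}%)")
```

The software-interrupt column shows where the gain comes from: roughly 30% fewer SWI-mode instructions, which shrinks to 0.39% overall because software interrupts are a tiny share of the total.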
Inline emulation:
- Reduces the effort to port guest operating systems
- Speeds up the handling of sensitive instructions (4-7x)
- Increases the overall system performance (0.39%)
Future work:
- Optimization for memory virtualization
- A much higher overall speedup is possible.
Conclusions