
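Page 1: Advanced Information Technology (II) - Virtualization (1) - Virtualization (V12N)

Department of Computer Science and Information Engineering, National Taiwan University

Chih-Wen Hsueh (薛智文)
cwhsueh@csie.ntu.edu.tw
http://www.csie.ntu.edu.tw/~cwhsueh/

101 Spring, March 15, Fri 678, DTH 104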

Page 2: Preface

NEWS Laboratory, Department of CSIE and Graduate Institute of Networking and Multimedia

• Steve Jobs (Apple, 1955-2011): "Stay hungry, stay foolish."
• Dennis Ritchie (C language, 1941-2011)
• Skype: eBay (4.1B USD, 2005), Microsoft (8.5B USD, 2011)
• Linux (Linus Torvalds, 1991)
• Android (Danger, 2003; Google, 2005; 34 eng.)
• Meego (Intel, Samsung, Feb 2010)
• Tizen (Intel, Samsung [Nokia], Sep 2011)
• Windows 8 (Microsoft, nVidia, 2011)
• iOS 5 (Apple, 2011)
• Firefox OS (Mozilla, 2012)
• MSN (Microsoft, 2013)
• Ubuntu Touch (Canonical, 2013)
• Android (85 eng., 2013), Andy Rubin (Today), Asus NB (200+ eng.)
• iPhone, iPad, HTC One, Samsung Galaxy S4 (Today)

Page 3: Outline

• Introduction
  • What is virtualization? Cloud?
  • Why is virtualization difficult?
  • How to virtualize?
• Case Study
  • Mobile Virtualization
  • Inline Emulation
  • Domain 1
• Q&A

Page 4: What is Virtualization?

The creation of a virtual version of something.

Examples: virtual address, virtual assistant, virtual class, virtual circuit, virtual community, Virtual Data Center, virtual device, virtual disk, virtual host, virtual keyboard, virtual machine, virtual market, virtual memory, virtual money, Virtual Private Network, virtual reality, …

What virtualization is used for:
• Running applications (cross-platform)
• Security
• Sharing hardware resources
• Fully utilizing hardware
• Etc.

Page 5: Virtual Assistant/Secretary

• S/he screens my email. She checks my main email accounts, handles what she can, and "redirects" the messages that require my personal attention to my private account. She has reduced my email load by 90 percent.
• She books my travel. She handles all the details, including airline reservations, hotels, car rental, etc. She sets up a trip in TripIt, so I have everything I need in one place.
• She makes calls on my behalf. She makes appointments (both personal and professional), confirms my appointments, checks my voice mail, and follows up as needed.
• She manages my calendar. Almost nothing gets on my calendar unless it passes through her first. We have agreed together that I will only accept appointments on two afternoons a week, and she works to stay within those boundaries.
• She handles other projects as needed. I continue to turn over more and more to her. For example, she recently screened all the people who had applied to be a community leader on my site. She and my manager, Joy, ended up picking the final ten I appointed.
• File my files.

Page 6: Types of Virtualization

• Hardware/platform virtualization
• Desktop virtualization
• Software virtualization
  • OS-level, Workspace, Application
• Storage virtualization
• Data virtualization
• Database virtualization
• Network virtualization

Page 7: Big Questions for Virtualization

• How fast can virtualization be?
• What kinds of applications can there be?
• What problems might it incur?
  • Technical
  • Security
  • Business
  • Politics
  • …

Homework:
• Turn in a 3-5 page report answering any of the above or related questions: what problems do you solve, and how?
• 1-3 members per group; reports will be posted on the course wiki.
• Q&A in the last hour of class.

Page 8: When the fake is taken for real, the real becomes fake (假若真時真亦假)

System type, by platform and data:

                 Platform: Virtual    Platform: Real
Data: Virtual    simulation           evaluation
Data: Real       emulation            implementation


Page 10: Why is Virtualization Difficult?

• The OS is moved to ring 1/ring 3.
  • 0/1/3 ring, e.g. x86_32
  • 0/3/3 ring, e.g. x86_64, ARM
• On x86, some instructions (sensitive instructions) cannot be trapped.

Critical Instructions

• Sensitive Register Instructions
  • SGDT, SIDT, SLDT
  • SMSW
  • PUSHF(D), POPF(D)
• Protected System Instructions
  • LAR, LSL, VERR, VERW
  • PUSH, POP
  • CALL, JMP, INT, RET
  • STR
  • MOV

Page 11: Virtual Machine Monitor (VMM) / Hypervisor

Type I - Hypervisor (layers, top to bottom):
• VM0, VM1, …, VMN
• Hypervisor, e.g. Xen
• Hardware

Type II - Hosted VMM (layers, top to bottom):
• VM0, VM1, …, VMN
• Hosted VMM, e.g. VMware
• Host Operating System
• Hardware

VM: Virtual Machine, Guest OS + Virtual Devices

Page 12: Software Execution Modes in Virtualization Environment

Mode         Physical mode   Virtual mode   Description
Hypervisor   Privileged      N/A            For executing the hypervisor only.
Kernel       User            Privileged     For executing the kernel of a virtual machine.
User         User            User           For executing user processes of a guest OS.

Page 13: The First Challenge of Virtualization: Virtualizable

According to Popek and Goldberg† in 1974, virtual machines can be constructed for a platform if

    Sensitive Instructions ⊆ Privileged Instructions

• Sensitive instructions: might change the state of system resources
• Privileged instructions: must be executed with sufficient privilege

† G. J. Popek and R. P. Goldberg, "Formal requirements for virtualizable third generation architectures," Commun. ACM, vol. 17, no. 7, pp. 412-421, Jul. 1974.
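The subset condition above can be checked mechanically once an instruction set is classified. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the toy classification reuses the x86 sensitive register instructions listed earlier in the deck plus LGDT and LIDT as examples of instructions that are both sensitive and privileged. It is an illustration, not a complete x86 model.

```python
# Toy classification (illustrative assumption, not a full ISA description).
X86_SENSITIVE = {"SGDT", "SIDT", "SLDT", "SMSW", "PUSHF", "POPF", "LGDT", "LIDT"}
X86_PRIVILEGED = {"LGDT", "LIDT"}


def untrappable(sensitive, privileged):
    """Return the sensitive instructions that do NOT trap in user mode."""
    return sorted(set(sensitive) - set(privileged))


def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: sensitive must be a subset of privileged."""
    return set(sensitive) <= set(privileged)


if __name__ == "__main__":
    print(is_virtualizable(X86_SENSITIVE, X86_PRIVILEGED))
    print(untrappable(X86_SENSITIVE, X86_PRIVILEGED))
```

Any instruction returned by `untrappable` is one the hypervisor cannot intercept by trapping alone, which is why the later slides turn to binary translation, hypercalls, and inline emulation.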

Page 14: How to Virtualize?

• Full Virtualization
  • Binary translation
• Para-Virtualization
  • Hypercall
• Hardware Assisted Virtualization
  • Intel VT-x & AMD SVM
  • Trap and emulate
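Trap and emulate can be pictured as a loop in which the guest runs directly until a privileged instruction traps, at which point the hypervisor applies the effect to virtual state instead of real hardware. The sketch below is a toy Python model under stated assumptions: the instruction names (`DISABLE_IRQ`, `ENABLE_IRQ`), the `vcpu` dictionary, and all function names are hypothetical and do not correspond to any real hypervisor API.

```python
class Trap(Exception):
    """Raised by the simulated CPU when user mode runs a privileged instruction."""

    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr


# Hypothetical privileged instructions of the toy machine.
PRIVILEGED = {"DISABLE_IRQ", "ENABLE_IRQ"}


def hardware_execute(instr, vcpu):
    """Simulated user-mode CPU: privileged instructions trap, others run directly."""
    if instr in PRIVILEGED:
        raise Trap(instr)
    vcpu["executed_natively"] += 1


def emulate(instr, vcpu):
    """Hypervisor side: apply the instruction's effect to the virtual CPU state."""
    if instr == "DISABLE_IRQ":
        vcpu["virtual_irq_enabled"] = False
    elif instr == "ENABLE_IRQ":
        vcpu["virtual_irq_enabled"] = True
    vcpu["emulated"] += 1


def run_guest(program):
    """Run a guest instruction stream with trap-and-emulate."""
    vcpu = {"virtual_irq_enabled": True, "executed_natively": 0, "emulated": 0}
    for instr in program:
        try:
            hardware_execute(instr, vcpu)
        except Trap as trap:
            emulate(trap.instr, vcpu)
    return vcpu
```

The model only works because every state-changing instruction traps; an instruction that is sensitive but not privileged would slip through `hardware_execute` unnoticed.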

Page 15: Case Study

1. Mobile Virtualization
2. Inline Emulation†
3. Domain 1
   • with Insyde Inc.

† Yuan-Cheng Lee, Chih-Wen Hsueh, and Rong-Guey Chang, "Inline Emulation: An Optimization Technique for Virtualization on Embedded Systems," Proc. of the 17th International Conference on Real-Time and Embedded Computing Systems and Applications (RTCSA'11), Toyama, Japan, August 2011.

Page 16: Mobile Virtualization

• 90+% performance on PC embedded multicore systems
• To run multiple OSes on a mobile phone…
• iPhone+Android is possible!
• Break the limitation of OSes!

Page 17: Hardware Assisted Paging

Translation path:
• VA to GPA: Primary MMU, using the Page Table
• GPA to MPA: Extended MMU, using the Extended Page Table
• MPA is used to access Main Memory

Legend:
• VA: virtual address
• GPA: guest physical address
• MPA: machine physical address
• MMU: memory management unit
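The two-stage translation on this slide can be sketched as two table lookups in sequence. This is a minimal Python sketch under stated assumptions: single-level tables represented as dictionaries from page number to frame number, a 4 KiB page size, and hypothetical function names. Real MMUs walk multi-level tables in hardware.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


def translate(table, addr):
    """Translate one address through a single-level page table (dict: page -> frame)."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise KeyError(f"page fault at address {addr:#x}")
    return table[page] * PAGE_SIZE + offset


def va_to_mpa(guest_page_table, extended_page_table, va):
    """Stage 1: VA -> GPA (guest page table). Stage 2: GPA -> MPA (extended page table)."""
    gpa = translate(guest_page_table, va)
    mpa = translate(extended_page_table, gpa)
    return gpa, mpa


if __name__ == "__main__":
    gpa, mpa = va_to_mpa({0: 5}, {5: 9}, 0x123)
    print(hex(gpa), hex(mpa))
```

The guest OS only ever maintains the first table; the second one belongs to the hypervisor, which is what lets the guest manage its own paging without trapping on every page-table update.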

Page 18: 0-miss Page Translation†

Translation path: VA, GPA, MPA, Main Memory, using the Primary MMU, the Page Table, and the bTLB.

50+% speedup

Legend:
• VA: virtual address
• GPA: guest physical address
• MPA: machine physical address
• MMU: memory management unit
• bTLB: buddy TLB

† Yuan-Cheng Lee, Chih-Wen Hsueh, "An Optimized Page Translation for Mobile Virtualization," to appear in Proc. of 50th Design Automation Conference (DAC), Austin, TX, USA, June 2013. (Top conference)

Page 19: Inline Emulation

• The First Challenge of Virtualization
• Idea of Inline Emulation
• Design of Inline Emulation
• Evaluation and Analysis
• Conclusions

Page 20: Related Work

• Secure Xen on ARM (Samsung)
  • It proved that virtualization is possible on the ARM platform.
• The PENAR project (University of Applied Sciences, Western Switzerland)
  • It integrated the source trees of Xen, RTLinux, and Linux for ARM.
• OKL4 (Open Kernel Labs)
  • A hypervisor that adopts a microkernel architecture for embedded systems.

Page 21: Issues on Virtualization for ARM

The most critical issue is:

    Sensitive instructions ⊈ Privileged instructions

Example:

    MOVS PC, LR    // move the value in link register to PC

It will cause unpredictable behavior when executed in user mode.

• SPSR: Saved Program Status Register
• CPSR: Current Program Status Register

Page 22: The Problematic Instructions (1/3)

Type I: instructions that cause an undefined instruction (UDI) exception when executed in user mode.

We call them Canonical Privileged Instructions.

Example:

    MCR p15, 0, r0, c2, c0, 0

• Moves r0 to c2 and c0 in the coprocessor specified by p15, for an operation selected by options 0 and 0.
• Operand-dependent operation

Page 23: The Problematic Instructions (2/3)

Type II: instructions that have no effect when executed in user mode.

Example:

    MSR cpsr_c, #0xD3

Switches to privileged mode and disables interrupts.

Program Status Register (PSR), bit 31 down to bit 0:

    N Z C V Q -- J -- GE[3:0] -- E A I F T M[4:0]

• Execution Flags
• Exception Mask
• Execution Mode
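To see what `#0xD3` in the example writes into the control field, the low byte can be decoded by hand. The sketch below is a small Python helper (the function name is hypothetical) that assumes the classic ARM layout of the control byte: I at bit 7, F at bit 6, T at bit 5, and the mode in M[4:0], with the standard mode encodings.

```python
# Standard ARM mode encodings for M[4:0].
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr_control(value):
    """Decode the control byte (bits 7..0) of a program status register value."""
    return {
        "irq_disabled": bool(value & 0x80),  # I bit
        "fiq_disabled": bool(value & 0x40),  # F bit
        "thumb": bool(value & 0x20),         # T bit
        "mode": MODES.get(value & 0x1F, "reserved"),
    }


if __name__ == "__main__":
    print(decode_cpsr_control(0xD3))
```

For 0xD3 this yields Supervisor mode with both IRQ and FIQ masked, which matches the slide's description. In user mode the same write is silently ignored, so the guest kernel would continue with the wrong idea of its own state.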

Page 24: The Problematic Instructions (3/3)

Type III: instructions that cause unpredictable behavior when executed in user mode.

Example:

    MOVS PC, LR

Page 25: Solutions

Complexity                                Binary translation   Hypercall         Inline emulation
Design                                    High                 Low               Low
Implementation                            Medium               High              Low
Runtime                                   High                 Medium            Low
Counterpart (in programming languages)    Virtual function     Normal function   Inline function

Page 26: The First Challenge of Virtualization: Example

For the ARM architecture, the instruction (TYPE III)

    MOVS PC, LR

• Changes the program counter and switches to user mode.
• However, it causes unpredictable behavior when executed in user mode.
• Therefore, it is a sensitive instruction but not a privileged instruction.

    Sensitive instructions ⊈ Privileged instructions

Page 27: The First Challenge of Virtualization: Solutions (1/2)

Dynamic Binary Translation

Basic block before translation:

        BL TLB_FLUSH_DENTRY
        …
    TLB_FLUSH_DENTRY:
        MCR p15, 0, R0, C8, C6, 1    // sensitive coprocessor write
        MOV PC, LR

After translation:

        BL TLB_FLUSH_DENTRY_NEW      // call redirected to the translated routine
        …
    TLB_FLUSH_DENTRY:
        MCR p15, 0, R0, C8, C6, 1
        MOV PC, LR
        …
    TLB_FLUSH_DENTRY_NEW:
        MOV R1, R0
        MOV R0, #CMD_FLUSH_DENTRY
        SWI #HYPER_CALL_TLB          // enter the hypervisor
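The rewriting step can be sketched as a pass over one basic block that swaps each sensitive instruction for a hypercall sequence. This Python sketch is a deliberate simplification and not the translator shown on the slide: it replaces the instruction in place rather than redirecting the `BL` to a new routine, it treats instructions as plain strings, and the names `HYPERCALL_STUBS` and `translate_block` are hypothetical.

```python
# Hypothetical rewrite rules: sensitive instruction -> hypercall sequence.
# The replacement sequence is the one shown for TLB_FLUSH_DENTRY_NEW.
HYPERCALL_STUBS = {
    "MCR p15, 0, R0, C8, C6, 1": [
        "MOV R1, R0",
        "MOV R0, #CMD_FLUSH_DENTRY",
        "SWI #HYPER_CALL_TLB",
    ],
}


def translate_block(block):
    """Return a copy of a basic block with sensitive instructions replaced."""
    translated = []
    for instr in block:
        # Unknown instructions are copied through unchanged.
        translated.extend(HYPERCALL_STUBS.get(instr, [instr]))
    return translated


if __name__ == "__main__":
    for line in translate_block(["MCR p15, 0, R0, C8, C6, 1", "MOV PC, LR"]):
        print(line)
```

A real translator also has to decode binary encodings, fix up branch targets after code grows, and cache translated blocks, which is where the high design and runtime complexity in the comparison table comes from.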

Page 28: The First Challenge of Virtualization: Solutions (2/2)

Virtualization APIs - hypercalls

    /* In Guest OS */
        BL TLB_FLUSH_DENTRY
        …
    TLB_FLUSH_DENTRY:
        MOV R1, R0
        MOV R0, #CMD_FLUSH_DENTRY
        SWI #HYPER_CALL_TLB

    /* In Hypervisor */
    SWI Handler, then Hypercall Handler:
        …
        LDR R1, [SP, #4]
        MCR p15, 0, R1, C8, C6, 1
        …
    Restore User Context & PC

Page 29: Hypercall

Control flow between the Guest OS and the Hypervisor:

1. Guest OS issues a hypercall (software interrupt).
2. Hypervisor: SWI Handler
3. Hypervisor: Hyper Call Handler
4. Reschedule?
   • Yes: context switch
   • No: return to the guest

Page 30: Idea of Inline Emulation

The Original Instruction

        MOV R0, VIRT_ADDR
        MCR p15, 0, R0, C8, C6, 1

Hypercall

    Guest OS:
        MOV R0, #CMD_FLUSH_DENTRY
        MOV R1, VIRT_ADDR
        SWI #HYPER_CALL_TLB

    Hypercall Handler:
        …
        LDR R1, [SP, #4]
        MCR p15, 0, R1, C8, C6, 1
        …
        Restore User Context & PC

Inline Emulation

    Guest OS:
        MOV R0, VIRT_ADDR
        MCR p15, 0, R0, C8, C6, 1

    Inline Emulation Handler:
        …
        /* restore user context */
        LDMIA SP, {R0-R14}
        MCR p15, 0, R0, C8, C6, 1    // the guest's instruction, inlined
        …
        Restore PC

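Page 31: Inline Emulation

Control flow between the Guest OS and the Hypervisor:

Canonical Privileged Instructions (TYPE I):
1. Guest OS executes the instruction, which raises a UDI exception.
2. Hypervisor: UDI Handler
3. Hypervisor: Inline Emulation
4. Return to guest

Hypercalls:
1. Guest OS issues a hypercall (software interrupt).
2. Hypervisor: SWI Handler
3. Hypervisor: Hyper Call Handler
4. Reschedule?
   • Yes: context switch
   • No: return to the guest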

Page 32: Design of Inline Emulation: The Main Handler

Flowchart branches:
• A handler for the instruction is found
• No handler for the instruction was found

Page 33: The Issue of Finding an Inline Emulation Handler

• It is hard to find a simple hash function,
  • because the encoding of ARM instructions is complicated.
• Instead, we can construct an efficient search table,
  • because only a few instructions are used frequently.

Instruction                     Ratio (%)
mcr p15, 0, Rd, c3, c0, 0       58.44
mcr p15, 0, Rd, c7, c14, 1      39.73
mcr p15, 0, Rd, c8, c5, 1       0.49
mcr p15, 0, Rd, c8, c6, 1       0.49
mcr p15, 0, Rd, c7, c10, 4      0.24
mcr p15, 0, Rd, c2, c0, 0       0.23
mcr p15, 0, Rd, c7, c5, 0       0.11
mcr p15, 0, Rd, c8, c5, 0       0.08
mcr p15, 0, Rd, c8, c6, 0       0.08
mrc p15, 0, Rd, c7, c14, 3      0.11
Others                          <0.01

Page 34: Example of Mto1 Search Table

Encoding of MCR instruction
Syntax: MCR{cond} cp, op1, Rd, CRn, CRm, op2

Bit layout, bit 31 down to bit 0:

    cond | 1110 | op1 | 0 | CRn | Rd | cp | op2 | 1 | CRm

mask         value        handler        Set
0x0F1F0F10   0x0E130F10   handler_CR3    MCR 15, op1, Rd, c3, CRm, op2
0x0F1C0F10   0x0E100F10   handler_CR02   MCR 15, op1, Rd, {c0 - c2}, CRm, op2
0x0F100F10   0x0E100F10   handler_CRX    MCR 15, op1, Rd, {c4 - c15}, CRm, op2
…            …            …
0x00000000   0x00000000   0x00000000     End of Table

* An entry E is matched if …
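A mask/value table of this kind is typically searched by testing each entry in order and taking the first hit. The following is a minimal Python sketch under stated assumptions: the matching rule `(instruction & mask) == value` is assumed, since the slide's condition is cut off in the transcript; `encode_mcr` and `find_handler` are hypothetical names; and the `value` constants are derived from the bit layout shown above (bit 20 = 0 for MCR), so they differ from the table's values in that bit rather than being a copy of it.

```python
def encode_mcr(op1, rd, crn, crm, op2, cp=15, cond=0xE):
    """Encode an MCR instruction following the bit layout on the slide."""
    return ((cond << 28) | (0b1110 << 24) | (op1 << 21) | (0 << 20)
            | (crn << 16) | (rd << 12) | (cp << 8) | (op2 << 5) | (1 << 4) | crm)


# (mask, value, handler name); entries are tried in order, so the specific
# c3 entry must come before the broader ones.
SEARCH_TABLE = [
    (0x0F1F0F10, 0x0E030F10, "handler_CR3"),   # CRn == c3
    (0x0F1C0F10, 0x0E000F10, "handler_CR02"),  # CRn in c0..c3 (c3 already taken)
    (0x0F100F10, 0x0E000F10, "handler_CRX"),   # any remaining CRn
]


def find_handler(instr, table=SEARCH_TABLE):
    """Return the first handler whose masked bits match, or None."""
    for mask, value, handler in table:
        if (instr & mask) == value:
            return handler
    return None


if __name__ == "__main__":
    print(hex(encode_mcr(0, 0, 3, 0, 0)), find_handler(encode_mcr(0, 0, 3, 0, 0)))
```

Ordering entries from most to least specific is what makes the table many-to-one: a handful of masks cover whole families of encodings, and the most frequent instruction is matched on the first comparison.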

Page 35: Design of Inline Emulation: Dynamic Inline Emulation (DIE) Handler

• Self-modifying
• Inlining the instruction
• Flushing caches

Page 36: Design of Inline Emulation: Static Inline Emulation (SIE) Handler

• Executing the hard-coded instructions /* data synchronization barrier */
• Restoring user context & PC

Page 37: Evaluation and Analysis: The Experiment Environment

Emulator      Android emulator (ARMv5)
Memory        12MB for the hypervisor
              32MB for the guest OS
Hypervisor    Xen 4.0.1 for ARMv5
Guest OS      Linux 2.6.29-Goldfish
Compilation   Using GCC with debug (-g) flag

Page 38: Evaluation and Analysis: The Distribution of Emulated Instructions

Instruction   CRn, CRm, op2   Ratio (%)
MCR           c3, c0, 0       58.44
              c7, c14, 1      39.73
              c8, c5, 1       0.49
              c8, c6, 1       0.49
              c7, c10, 4      0.24
              c2, c0, 0       0.23
              c7, c5, 0       0.11
              c8, c5, 0       0.08
              c8, c6, 0       0.08
MRC           c7, c14, 3      0.11
Others                        <0.01

The first two rows account for more than 98%.
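The "more than 98%" figure follows directly from the table: the two most frequent MCR forms sum to 58.44 + 39.73 = 98.17%. As a quick check, the cumulative coverage can be computed from the listed ratios; the function name below is hypothetical and the data are copied from the table.

```python
# Ratios (%) from the table, in the order listed (the "Others" row is omitted).
RATIOS = [58.44, 39.73, 0.49, 0.49, 0.24, 0.23, 0.11, 0.08, 0.08, 0.11]


def cumulative_coverage(ratios):
    """Return running totals, rounded to two decimals like the table."""
    totals = []
    running = 0.0
    for ratio in ratios:
        running += ratio
        totals.append(round(running, 2))
    return totals


if __name__ == "__main__":
    print(cumulative_coverage(RATIOS))
```

This skew is what justifies a short, ordered search table: handling only the first two encodings quickly already covers almost every emulated instruction.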

Page 39: Evaluation and Analysis: The Micro-Level Analysis (1/2)

Instruction counts by processor mode. PV: paravirtualization (hypercall); IE: inline emulation.

Operation: Invalidating TLB     Method   USER    UND     SWI      Total    Improvement PV/IE (%)
A single entry (DIE handler)    PV       13.00   0.00    305.97   318.97   613.39
                                IE       3.00    49.00   0.00     52.00
The entire TLB (SIE handler)    PV       11.00   0.00    305.80   316.80   704.01
                                IE       3.00    42.00   0.00     45.00

Page 40: Evaluation and Analysis: The Micro-Level Analysis (2/2)

Instruction counts by processor mode. PV: paravirtualization (hypercall); IE: inline emulation.

Instruction                                 Method   USER    UND     SWI      Total    Improvement PV/IE (%)
MCR p15, 0, Rd, c3, c0, 0 (DIE handler)     PV       9.00    0.00    203.29   212.29   424.57
                                            IE       3.00    47.00   0.00     50.00
MCR p15, 0, Rd, c7, c14, 1 (DIE handler)    PV       13.00   0.00    304.50   317.50   566.94
                                            IE       3.00    53.00   0.00     56.00

Inline emulation achieves at least 4.24X the performance of hypercalls in most cases (about 98%).
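The improvement column is the ratio of total instruction counts, PV over IE, expressed as a percentage. For the first row of this table, 212.29 / 50.00 ≈ 4.25, which is where the "at least 4.24X" figure comes from. The sketch below recomputes the column from the totals in the two micro-level tables; the function name is hypothetical, and small differences from the reported values (a few hundredths of a percent) are presumably due to rounding of the published totals.

```python
def improvement_percent(pv_total, ie_total):
    """Improvement of inline emulation over paravirtualization, as PV/IE in percent."""
    return 100.0 * pv_total / ie_total


# (description, PV total, IE total, reported improvement %) from the tables.
ROWS = [
    ("Invalidate single TLB entry (DIE)", 318.97, 52.00, 613.39),
    ("Invalidate entire TLB (SIE)", 316.80, 45.00, 704.01),
    ("MCR p15, 0, Rd, c3, c0, 0 (DIE)", 212.29, 50.00, 424.57),
    ("MCR p15, 0, Rd, c7, c14, 1 (DIE)", 317.50, 56.00, 566.94),
]

if __name__ == "__main__":
    for name, pv, ie, reported in ROWS:
        print(f"{name}: computed {improvement_percent(pv, ie):.2f}, reported {reported:.2f}")
```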

Page 41: Evaluation and Analysis: The Macro-Level Analysis

                                    Data Processing   Data Transfer   Branch   Software Interrupt   Coprocessor and Other   Total
Paravirtualization (instructions)   89.22M            91.28M          27.08M   48560                4.79M                   212.42M
Inline Emulation (instructions)     89.04M            90.66M          26.93M   33658                4.93M                   211.59M
(PV - IE) / PV (%)                  0.20              0.68            0.53     30.69                -2.72                   0.39
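The last row is the relative reduction in executed instructions, (PV - IE) / PV. For software interrupts, (48560 - 33658) / 48560 ≈ 30.69%, and for the total, (212.42 - 211.59) / 212.42 ≈ 0.39%. The sketch below recomputes these two columns, which can be reproduced from the published figures; the function name is hypothetical. Columns given only in rounded millions may differ slightly from the reported percentages.

```python
def reduction_percent(pv, ie):
    """Relative reduction in instruction count, (PV - IE) / PV, in percent."""
    return 100.0 * (pv - ie) / pv


if __name__ == "__main__":
    # Software interrupt column (exact counts from the table).
    print(round(reduction_percent(48560, 33658), 2))
    # Total column (millions of instructions).
    print(round(reduction_percent(212.42, 211.59), 2))
```

The large drop in software interrupts against a small change in the total shows that coprocessor handling is a small share of overall execution, which is consistent with the modest 0.39% overall gain reported in the conclusions.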

Page 42: Conclusions

Inline emulation:
• Reduces the effort needed to port guest operating systems
• Speeds up the handling of sensitive instructions (4-7x)
• Improves the overall system performance (0.39%)

Future work:
• Optimization for memory virtualization
  • A much higher overall speedup is possible.