towards full virtualization of heterogeneous noc-based multicore embedded architecture 2012 ieee...

Post on 11-Jan-2016

1

Towards Full Virtualization of Heterogeneous NoC-based Multicore Embedded Architecture

2012 IEEE 15th International Conference on Computational Science and Engineering

George Kornaros

Marcello Coppola

曾冠維

2

Outline

• Introduction

• System architecture

• I/OMMU architecture

• Conclusions

4

Motivation

• A virtualization-ready SoC platform must support the necessary extensions across the HW/SW stack: applications, programming model, hypervisor and hardware platform.

• The authors present the main hardware extensions and architecture of a heterogeneous multicore embedded system supporting virtualization.

5

vIrtical

• vIrtical is a system platform architecture developed for a heterogeneous SoC target.

• Three on-chip networks are used to implement cache coherence.

• A specialized I/O memory management unit (IOMMU) is used.

7

Host Processor: ARM’s big.LITTLE

(Interrupt controller)

1. Switch or migration mode

2. MP mode

8

Memory NoC

• The CCI-400 component implements the AMBA4 ACE protocol (AXI Coherency Extensions) which provides a framework for system level coherence.

• It implements distributed virtual memory (DVM) mechanisms, useful to support virtualization.

• The ACE protocol permits cached copies of the same memory location to reside in the local cache of one or more master components.

9

System NoC: Spidergon STNoC

• Customized packet-switched communication architectures

• Switching, flow control, arbitration and buffering schemes will be inherited from the STNoC architecture.

• In order to support coherence, STNoC will also transport coherence messages, by encapsulating ACE protocol transactions.

10

Spidergon NoC (STNoC)

• The Spidergon network connects a generic even number of nodes N as a bidirectional ring, in both clockwise and anti-clockwise directions, with an additional cross connection between each pair of opposite nodes.

• A low-cost and flexible architecture.
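The topology described above can be sketched in a few lines. This is only an illustration: the function name and the 0..N-1 node numbering are assumptions, not taken from the slides.

```python
def spidergon_neighbors(node: int, n: int) -> dict[str, int]:
    """Return the three neighbours of `node` in an n-node Spidergon.

    Nodes are numbered 0..n-1 around the ring (an assumed convention).
    """
    if n < 4 or n % 2 != 0:
        raise ValueError("Spidergon needs an even number of nodes (at least 4)")
    if not 0 <= node < n:
        raise ValueError("node index out of range")
    return {
        "clockwise": (node + 1) % n,       # next node on the ring
        "anticlockwise": (node - 1) % n,   # previous node on the ring
        "across": (node + n // 2) % n,     # cross connection to the opposite node
    }
```

Each node therefore has a constant degree of three regardless of N, which is one reason the topology is considered low cost.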

11

Hardware Accelerator

• An SMP multicore host processor is coupled to accelerators of different kinds:

• GPU-like general-purpose programmable many-cores (GPPA).

• Different types of Hardware Processing Units (HWPU).

• The host processor can leverage these accelerators to offload data-intensive computational kernels, achieving significant speedup.

12

The vIrtical system architecture

[Figure: vIrtical system architecture block diagram, with the accelerator labeled]

14

I/OMMU architecture

• IOMMU functionality must focus on translating the virtual address space of fully virtualized guest devices to a global physical address space.

• This address translation is implemented efficiently using a paging scheme supported by an associated I/O translation look-aside buffer (I/O TLB).
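A minimal sketch of paged address translation, to make the idea concrete. The 4 KiB page size, the flat dictionary used as a page table, and all names are assumptions for illustration; the slides do not give these details.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages (not stated in the slides)
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(virt_addr: int, page_table: dict[int, int]) -> int:
    """Map a device virtual address to a physical address.

    `page_table` maps virtual page numbers to physical frame numbers
    (a hypothetical flat table, used here only for illustration).
    """
    vpn = virt_addr >> PAGE_SHIFT    # virtual page number
    offset = virt_addr & PAGE_MASK   # offset within the page is unchanged
    if vpn not in page_table:
        raise LookupError(f"no mapping for virtual page {vpn:#x}")
    return (page_table[vpn] << PAGE_SHIFT) | offset
```

For example, with virtual page 0x1 mapped to frame 0x80, the address 0x1234 translates to 0x80234: only the page number changes.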

15

IOMMU Internal Organization

16

Command Processing Engine (CPE)

• The Command Processing Engine (CPE) is responsible for interfacing and dispatching incoming commands and performing IOMMU component configuration.

17

I/O translation look-aside buffer (IOTLB)

• The I/O translation look-aside buffer (IOTLB) accelerates page translation of DMA addresses by avoiding expensive remote loading of page table entries.
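The caching role of the IOTLB can be sketched as a small cache placed in front of a slow lookup. The LRU replacement policy, the (device, page) key, and the class and method names are assumptions; the slides do not describe the actual IOTLB organization.

```python
from collections import OrderedDict
from typing import Callable


class IoTlb:
    """Toy translation cache keyed by (device id, virtual page number)."""

    def __init__(self, capacity: int, walk: Callable[[int, int], int]):
        self.capacity = capacity
        self.walk = walk              # slow path, e.g. a page table walk in memory
        self.entries: OrderedDict[tuple[int, int], int] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, device_id: int, vpn: int) -> int:
        key = (device_id, vpn)
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)     # mark as most recently used
            return self.entries[key]
        self.misses += 1
        frame = self.walk(device_id, vpn)     # expensive remote load
        self.entries[key] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used (assumed policy)
        return frame
```

Repeated DMA accesses to the same page then cost one slow lookup followed by cache hits, which is the saving the slide refers to.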

18

The Memory Page Table Walker (MPTW)

• The Memory Page Table Walker (MPTW) accesses system memory to perform address translation in case of an IOTLB cache miss.
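As a rough illustration of what a table walk involves, the sketch below performs a two-level walk. The two-level layout, the 10/10/12-bit address split, and the dictionary standing in for system memory are all assumptions; the slides do not specify the page table format used by the MPTW.

```python
from typing import Optional

L1_BITS, L2_BITS, OFFSET_BITS = 10, 10, 12   # assumed 32-bit address split


def walk_page_table(memory: dict[int, dict[int, int]],
                    root: int, virt_addr: int) -> Optional[int]:
    """Return the physical address for `virt_addr`, or None if unmapped.

    `memory` maps a table base address to that table's entries
    (a hypothetical stand-in for system memory).
    """
    l1_index = (virt_addr >> (OFFSET_BITS + L2_BITS)) & ((1 << L1_BITS) - 1)
    l2_index = (virt_addr >> OFFSET_BITS) & ((1 << L2_BITS) - 1)
    offset = virt_addr & ((1 << OFFSET_BITS) - 1)

    l2_base = memory.get(root, {}).get(l1_index)     # first memory access
    if l2_base is None:
        return None
    frame = memory.get(l2_base, {}).get(l2_index)    # second memory access
    if frame is None:
        return None
    return (frame << OFFSET_BITS) | offset
```

Each level costs one access to system memory, which is why a miss is much more expensive than an IOTLB hit.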

19

The IOMMU Device and Domain Table (IODT)

• The IOMMU Device and Domain Table (IODT) contains configuration data for each device in order to provide proper protection for incoming translation requests.

20

Virtual Machine and Guest OS Protection

• Protection mechanisms:

• Multiple isolated domains can be supported by ensuring that all I/O devices are assigned to some domain (possibly a default domain), and that they can access only physical resources allocated to this domain.
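The isolation rule above can be sketched as a simple check. The class, the default domain id, and the representation of a domain's resources as address ranges are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

DEFAULT_DOMAIN = 0   # hypothetical id for the default domain


@dataclass
class DomainGuard:
    # device id -> domain id; unlisted devices fall into the default domain
    device_domain: dict[int, int] = field(default_factory=dict)
    # domain id -> list of [start, limit) physical ranges allocated to it
    domain_ranges: dict[int, list[tuple[int, int]]] = field(default_factory=dict)

    def domain_of(self, device_id: int) -> int:
        return self.device_domain.get(device_id, DEFAULT_DOMAIN)

    def may_access(self, device_id: int, phys_addr: int, length: int = 1) -> bool:
        """True only if the whole access lies inside one range of the device's domain."""
        end = phys_addr + length
        ranges = self.domain_ranges.get(self.domain_of(device_id), [])
        return any(start <= phys_addr and end <= limit for start, limit in ranges)
```

A device that was never assigned explicitly still belongs to some domain, so every access is checked against a definite set of resources.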

21

Virtual Machine and Guest OS Protection (cont.)

• Two data structures maintain the information contained in the IODT:

• (i) I/O Device Table Control (IODTC)

• (ii) I/O Device Table Domain (IODTD)

• The I/O Device Table Control (IODTC) is indexed using a 4-bit wide Device ID.

• The IODTD contains 256 entries per device.
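The sizes implied by these figures can be worked out directly: a 4-bit Device ID addresses 16 devices, and 256 entries per device gives 4096 IODTD entries in total. The flat, device-major layout and the function names below are assumptions; the slides do not say how the IODTD is laid out or what selects an entry within a device's block.

```python
DEVICE_ID_BITS = 4
NUM_DEVICES = 1 << DEVICE_ID_BITS          # 16 devices addressable by the IODTC
ENTRIES_PER_DEVICE = 256                   # IODTD entries per device
TOTAL_IODTD_ENTRIES = NUM_DEVICES * ENTRIES_PER_DEVICE   # 4096


def iodtc_index(device_id: int) -> int:
    """Index into the control table; valid Device IDs are 0..15."""
    if not 0 <= device_id < NUM_DEVICES:
        raise ValueError("Device ID does not fit in 4 bits")
    return device_id


def iodtd_index(device_id: int, entry: int) -> int:
    """Flat index into the domain table (assumed device-major layout)."""
    if not 0 <= entry < ENTRIES_PER_DEVICE:
        raise ValueError("entry out of range")
    return iodtc_index(device_id) * ENTRIES_PER_DEVICE + entry
```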

22

IOMMU device control and domain table

VM: Virtual Machine; AS: Application Space

23

The Hardware Monitoring Unit (HMU)

• The Hardware Monitoring Unit (HMU) includes agents that provide custom circuitry for monitoring particular events.

24

IOMMU Monitoring Unit

• The monitored events relate to:

• (i) internal IOMMU activity (counter statistics and error logs).

• (ii) interface transactions (AXI bus).

• These agents can be used to estimate key performance metrics, e.g. by analyzing memory access latency structure, throughput and resource utilization, and help optimize the IOMMU architecture by introducing static configuration.
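As an illustration of how such counters might be turned into metrics, the sketch below derives a hit rate, an average latency and a throughput figure. The counter names and the set of counters are hypothetical; the slides do not list what the HMU actually records.

```python
from dataclasses import dataclass


@dataclass
class HmuCounters:
    # All fields are hypothetical counters, named here for illustration.
    tlb_hits: int
    tlb_misses: int
    total_latency_cycles: int   # summed latency of all translation requests
    bytes_transferred: int      # payload observed on the bus interface
    elapsed_cycles: int         # length of the measurement window


def summarize(c: HmuCounters, clock_hz: int) -> dict[str, float]:
    """Derive basic performance metrics from raw counter values."""
    requests = c.tlb_hits + c.tlb_misses
    return {
        "tlb_hit_rate": c.tlb_hits / requests if requests else 0.0,
        "avg_latency_cycles": c.total_latency_cycles / requests if requests else 0.0,
        "throughput_bytes_per_s": (
            c.bytes_transferred * clock_hz / c.elapsed_cycles
            if c.elapsed_cycles else 0.0
        ),
    }
```

A low hit rate combined with high average latency would, for instance, suggest that a larger TLB configuration is worth trying.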

25

The Device Discovery Unit (DDU)

• The Device Discovery Unit (DDU) is responsible for establishing communication with a newly connected device by exchanging identification information.

26

The Interrupt Unit (INTR)

• The Interrupt Unit (INTU) is in charge of generating interrupts in the event of system exceptions.

27

Functional Behavior and Synchronization

29

Conclusion

• The work focuses on hardware-assisted virtualization rather than software-based virtualization.

• A novel hardware I/O memory management unit (IOMMU) is introduced to map DMA virtual addresses to the correct VM’s physical memory locations.

• High performance is supported by a configurable TLB.

• Protection is enhanced by an integrated lightweight hardware monitoring unit.

30

Thank you for listening