Towards Full Virtualization of Heterogeneous NoC-based Multicore Embedded Architecture

2012 IEEE 15th International Conference on Computational Science and Engineering

George Kornaros

Marcello Coppola

曾冠維

Outline

• Introduction

• System architecture

• I/OMMU architecture

• Conclusions

Motivation

• A virtualization-ready SoC platform must support the necessary extensions across the whole HW/SW stack: applications, programming model, hypervisor and hardware platform.

• The authors present the main hardware extensions and the architecture of a heterogeneous multicore embedded system that supports virtualization.

vIrtical

• vIrtical is a system platform architecture developed for a heterogeneous SoC target.

• Three on-chip networks are used to implement a cache-coherent system.

• A specialized I/O memory management unit (IOMMU) is used.

Host Processor: ARM’s big.LITTLE

(Interrupt controller)

1. Switch or migration mode

2. MP mode

Memory NoC

• The CCI-400 component implements the AMBA 4 ACE (AXI Coherency Extensions) protocol, which provides a framework for system-level coherence.

• It implements distributed virtual memory (DVM) mechanisms, useful to support virtualization.

• The ACE protocol permits cached copies of the same memory location to reside in the local cache of one or more master components.

System NoC: Spidergon STNoC

• Customized packet-switched communication architectures

• Switching, flow control, arbitration and buffering schemes will be inherited from the STNoC architecture.

• To support coherence, STNoC will also transport coherence messages by encapsulating ACE protocol transactions.

Spidergon NOC (STNoC)

• The Spidergon network connects an even number of nodes N as a bidirectional ring (clockwise and anti-clockwise), with an additional cross connection between each pair of opposite nodes.

• A low-cost and flexible architecture.
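
The connectivity described above can be sketched in a few lines. This is only an illustration of the topology, not STNoC code: the function names and the 0..N-1 node numbering are assumptions made for the example.

```python
def spidergon_neighbors(node, n):
    """Return the three neighbors of `node` in an n-node Spidergon topology.

    Assumption: nodes are numbered 0..n-1 around the ring.
    """
    if n < 4 or n % 2 != 0:
        raise ValueError("Spidergon needs an even number of nodes (at least 4)")
    if not 0 <= node < n:
        raise ValueError("node index out of range")
    return {
        "clockwise": (node + 1) % n,        # next node on the ring
        "anticlockwise": (node - 1) % n,    # previous node on the ring
        "across": (node + n // 2) % n,      # cross connection to the opposite node
    }


def spidergon_links(n):
    """Enumerate the undirected links: n ring links plus n/2 cross links."""
    links = set()
    for node in range(n):
        for neighbor in spidergon_neighbors(node, n).values():
            links.add(frozenset((node, neighbor)))
    return links
```

Every node has exactly three links regardless of N, which is one way to see why the topology stays cheap as the node count grows.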

Hardware Accelerator

• An SMP multicore host processor is coupled to accelerators of different kinds:

• GPU-like general-purpose programmable many-cores (GPPA).

• Different types of Hardware Processing Units (HWPU).

• The host processor can leverage these accelerators to offload data-intensive computational kernels, achieving significant speedup.

The vIrtical system architecture

(Figure: the vIrtical system architecture, with the accelerator labeled)

I/OMMU architecture

• IOMMU functionality must focus on translating the virtual address space of fully virtualized guest devices to a global physical address space.

• This address translation is implemented efficiently using a paging scheme supported by an associated I/O translation look-aside buffer (IOTLB).
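
As a rough illustration of the paging scheme, the sketch below splits a device-visible virtual address into a page number and an offset, then maps the page number to a physical frame. The 4 KiB page size, the flat single-level table and all names are assumptions for the example; the slides do not specify them.

```python
PAGE_SHIFT = 12                 # assumption: 4 KiB pages (not stated in the slides)
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = PAGE_SIZE - 1


def split_address(virt_addr):
    """Split a virtual address into (virtual page number, offset within page)."""
    return virt_addr >> PAGE_SHIFT, virt_addr & PAGE_MASK


def translate(virt_addr, page_table):
    """Translate using a flat {virtual page number: physical frame number} table.

    A real IOMMU page table is usually multi-level; a dict keeps the sketch short.
    """
    vpn, offset = split_address(virt_addr)
    if vpn not in page_table:
        raise LookupError("no mapping for virtual page %#x" % vpn)
    return (page_table[vpn] << PAGE_SHIFT) | offset
```

For example, with the table `{0x1: 0x80}`, the address `0x1234` lies in virtual page `0x1` at offset `0x234`, so it maps to frame `0x80` at the same offset.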

IOMMU Internal Organization

Command Processing Engine (CPE)

• The Command Processing Engine (CPE) is responsible for interfacing and dispatching incoming commands and for performing IOMMU component configuration.

I/O translation look-aside buffer (IOTLB)

• The I/O translation look-aside buffer (IOTLB) accelerates page translation of DMA addresses by avoiding expensive remote loading of page table entries.

The Memory Page Table Walker (MPTW)

• The Memory Page Table Walker (MPTW) accesses system memory to perform address translation in case of an IOTLB cache miss.
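
The interplay between the IOTLB and the MPTW can be sketched as a small cache in front of a slower lookup. This is a behavioral model under stated assumptions, not the hardware design: the LRU replacement policy, the 4 KiB page size, the default capacity and the class names are all hypothetical.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class PageTableWalker:
    """Stand-in for the MPTW: fetches translations from a table in 'system memory'."""

    def __init__(self, page_table):
        self.page_table = page_table   # {virtual page number: physical frame number}
        self.walks = 0                 # number of (expensive) memory walks performed

    def walk(self, vpn):
        self.walks += 1
        return self.page_table[vpn]    # KeyError models an unmapped page


class IOTLB:
    """Small translation cache; LRU replacement is an assumption of this sketch."""

    def __init__(self, walker, capacity=4):
        self.walker = walker
        self.capacity = capacity
        self.entries = OrderedDict()   # vpn -> frame, oldest entry first

    def translate(self, dma_addr):
        vpn = dma_addr >> PAGE_SHIFT
        offset = dma_addr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.entries:                    # hit: no memory access needed
            self.entries.move_to_end(vpn)
            frame = self.entries[vpn]
        else:                                      # miss: fall back to the walker
            frame = self.walker.walk(vpn)
            self.entries[vpn] = frame
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)   # evict least recently used
        return (frame << PAGE_SHIFT) | offset
```

Repeated DMA accesses to the same page trigger only one walk; the walker is consulted again only after the entry has been evicted.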

The IOMMU Device and Domain Table (IODT)

• The IOMMU Device and Domain Table (IODT) contains configuration data for each device in order to provide proper protection for incoming translation requests.

Virtual Machine and Guest OS Protection

• Protection mechanisms:

• Multiple isolated domains can be supported by ensuring that all I/O devices are assigned to some domain (possibly a default domain), and that they can access only physical resources allocated to this domain.

Virtual Machine and Guest OS Protection (cont.)

• Two data structures maintain the information contained in the IODT:

• (i) I/O Device Table Control (IODTC)

• (ii) I/O Device Table Domain table (IODTD)

• The I/O Device Table Control (IODTC) is indexed using a 4-bit-wide Device ID.

• The IODTD contains 256 entries per device.
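
The sizes given above (a 4-bit Device ID, hence 16 control entries, and 256 domain entries per device) suggest a two-table layout along the following lines. Only those two sizes come from the slides; the entry contents (`enabled`, a page table base), the 8-bit index into the IODTD and the method names are hypothetical.

```python
DEVICE_ID_BITS = 4                    # from the slides: 4-bit Device ID
NUM_DEVICES = 1 << DEVICE_ID_BITS     # 16 IODTC entries
DOMAIN_ENTRIES_PER_DEVICE = 256       # from the slides: 256 IODTD entries per device


class IODT:
    def __init__(self):
        # IODTC: one control entry per device; None means "not configured".
        self.iodtc = [None] * NUM_DEVICES
        # IODTD: 256 entries per device; the entry content is hypothetical here.
        self.iodtd = [[None] * DOMAIN_ENTRIES_PER_DEVICE
                      for _ in range(NUM_DEVICES)]

    @staticmethod
    def _check_range(device_id, index=0):
        if not 0 <= device_id < NUM_DEVICES:
            raise ValueError("device ID must fit in 4 bits")
        if not 0 <= index < DOMAIN_ENTRIES_PER_DEVICE:
            raise ValueError("domain index must be in 0..255")

    def configure_device(self, device_id, enabled=True):
        self._check_range(device_id)
        self.iodtc[device_id] = {"enabled": enabled}

    def map_domain(self, device_id, index, page_table_base):
        self._check_range(device_id, index)
        self.iodtd[device_id][index] = page_table_base

    def lookup(self, device_id, index):
        """Return the page table base for a request, or refuse it."""
        self._check_range(device_id, index)
        control = self.iodtc[device_id]
        if control is None or not control["enabled"]:
            raise PermissionError("device is not configured or is disabled")
        base = self.iodtd[device_id][index]
        if base is None:
            raise PermissionError("no domain mapped at this index")
        return base
```

A request from a device that was never configured, or for an index with no mapping, is rejected before any translation takes place, which is the isolation property the protection slides describe.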

IOMMU device control and domain table

VM: Virtual Machine

AS: Application Space

The Hardware Monitoring Unit (HMU)

• The Hardware Monitoring Unit (HMU) includes agents that provide custom circuitry for monitoring particular events.

IOMMU Monitoring Unit

• The monitored events relate to:

• (i) internal IOMMU activity (counter statistics and error logs).

• (ii) interface transactions (AXI bus).

• These agents can be used to estimate key performance metrics, e.g. by analyzing memory access latency structure, throughput and resource utilization, and they help optimize the IOMMU architecture by introducing static configuration.
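
To make the idea of counter statistics concrete, the sketch below shows how a few event counters could be turned into metrics such as hit rate and mean access latency. The counter set, the cycle-based latency unit and all names are assumptions for illustration; the slides do not list the actual counters.

```python
from dataclasses import dataclass


@dataclass
class MonitorCounters:
    """Hypothetical counters a monitoring agent might expose."""
    iotlb_hits: int = 0
    iotlb_misses: int = 0
    total_latency_cycles: int = 0

    def record(self, hit, latency_cycles):
        """Account for one translation request."""
        if hit:
            self.iotlb_hits += 1
        else:
            self.iotlb_misses += 1
        self.total_latency_cycles += latency_cycles

    @property
    def accesses(self):
        return self.iotlb_hits + self.iotlb_misses

    def hit_rate(self):
        return self.iotlb_hits / self.accesses if self.accesses else 0.0

    def mean_latency(self):
        return self.total_latency_cycles / self.accesses if self.accesses else 0.0
```

A low hit rate combined with a high mean latency would, for instance, point towards a larger IOTLB in the next static configuration.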

The Device Discovery Unit (DDU)

• The Device Discovery Unit (DDU) is responsible for establishing communication with a newly connected device by exchanging identification information.

The Interrupt Unit (INTU)

• The Interrupt Unit (INTU) is in charge of generating interrupts in the event of system exceptions.

Functional Behavior and Synchronization

Conclusion

• The work focuses on hardware-assisted virtualization rather than software-based virtualization.

• A novel hardware I/O memory management unit (IOMMU) is introduced to map DMA virtual addresses to the correct VM’s physical memory locations.

• High performance is supported by a configurable TLB.

• Protection is enhanced by an integrated lightweight hardware monitoring unit.

Thank you for listening