Devconf2017 - Can VMs networking benefit from DPDK


TRANSCRIPT

Page 1: Devconf2017 - Can VMs networking benefit from DPDK

Can VMs networking benefit from DPDK?

Virtio/Vhost-user status & updates

Maxime Coquelin – Victor Kaplansky, 2017-01-27

Page 2: Devconf2017 - Can VMs networking benefit from DPDK

2

Agenda – Can VMs networking benefit from DPDK?

● Overview
● Challenges
● New & upcoming features

Page 3: Devconf2017 - Can VMs networking benefit from DPDK

Overview

Page 4: Devconf2017 - Can VMs networking benefit from DPDK

4

DPDK – project overview (Overview)

DPDK is a set of userspace libraries aimed at fast packet processing.

● Data Plane Development Kit
● Goal:
  ● Benefit from software flexibility
  ● Achieve performance close to dedicated HW solutions

Page 5: Devconf2017 - Can VMs networking benefit from DPDK

5

DPDK – project overview (Overview)

● License: BSD
● CPU architectures: x86, Power8, TILE-Gx & ARM
● NICs: Intel, Mellanox, Broadcom, Cisco, ...
● Other HW: Crypto, SCSI for the SPDK project
● Operating systems: Linux, BSD

Page 6: Devconf2017 - Can VMs networking benefit from DPDK

6

DPDK – project history (Overview)

Timeline (2012–2017):
● 1st release by Intel as a Zip file
● 6WIND initiates the dpdk.org community
● Packaged in Fedora
● Power8 & TILE-Gx support
● ARM support / Crypto
● Moving to the Linux Foundation

Releases: v1.2, v1.3, v1.4, v1.5, v1.6, v1.7, v1.8, v2.0, v2.1, v2.2, v16.04, v16.07, v16.11, v17.02, v17.05, v17.08, v17.11

v16.11: ~750K LoC / ~6000 commits / ~350 contributors

Page 7: Devconf2017 - Can VMs networking benefit from DPDK

7

DPDK – comparison (Overview)

[Diagram] Kernel networking path: NIC → Driver → Net stack → Socket (in the kernel) → Application (in user space).
DPDK path: NIC → VFIO (kernel) → DPDK → Application, with packet processing entirely in user space.

Page 8: Devconf2017 - Can VMs networking benefit from DPDK

8

DPDK – performance (Overview)

DPDK uses:
● CPU isolation/partitioning & polling
  → Dedicated CPU cores to poll the device
● VFIO/UIO
  → Direct device register accesses from user space
● NUMA awareness
  → Resources local to the Poll-Mode Driver's (PMD) CPU
● Hugepages
  → Fewer TLB misses, no swap

Page 9: Devconf2017 - Can VMs networking benefit from DPDK

9

DPDK – performance (Overview)

To avoid:
● Interrupt handling
  → The kernel's NAPI polling mode is not enough
● Context switching
● Kernel/user data copies
● Syscall overhead
  → More than the time budget for a 64B packet at 14.88 Mpps (see the polling-loop sketch below)
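To make the polling model concrete, here is a minimal sketch of a run-to-completion receive/transmit loop using DPDK's rte_ethdev burst API. It is not from the slides; the port and queue identifiers, burst size and echo-back forwarding logic are illustrative assumptions.

```c
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Poll one RX queue forever from a dedicated lcore: no interrupts,
 * no syscalls and no kernel/user copies on the data path. */
static void poll_loop(uint16_t port_id)
{
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        /* Fetch up to BURST_SIZE packets directly from the NIC ring. */
        uint16_t nb_rx = rte_eth_rx_burst(port_id, 0, bufs, BURST_SIZE);
        if (nb_rx == 0)
            continue;

        /* Echo the packets back out of the same port. */
        uint16_t nb_tx = rte_eth_tx_burst(port_id, 0, bufs, nb_rx);

        /* Free any mbufs the TX ring could not accept. */
        for (uint16_t i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(bufs[i]);
    }
}
```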

Page 10: Devconf2017 - Can VMs networking benefit from DPDK

10

DPDK – components (Overview)

[Diagram] DPDK library components. Credits: Tim O'Driscoll – Intel

Page 11: Devconf2017 - Can VMs networking benefit from DPDK

11

Virtio/Vhost (Overview)

● Device emulation, direct assignment, VirtIO
● Vhost: in-kernel virtio device emulation
  ● Allows device emulation code to directly call into kernel subsystems
  ● Vhost worker thread in the host kernel
  ● Bypasses system calls from user to kernel space on the host

Page 12: Devconf2017 - Can VMs networking benefit from DPDK

12

Vhost driver model

[Diagram] The VM runs on QEMU (userspace) and kvm.ko (host kernel). The in-kernel vhost instance is wired to KVM through ioeventfd (guest kick → vhost) and irqfd (vhost → guest interrupt), so the data path does not go through QEMU.

Page 13: Devconf2017 - Can VMs networking benefit from DPDK

13

In-kernel device emulation (Overview)

● The in-kernel part is restricted to virtqueue emulation
● QEMU handles the control plane: feature negotiation, migration, etc.
● File descriptor polling done by vhost in the kernel
● Buffers moved between the tap device and the virtqueues by a kernel worker thread (see the setup sketch below)
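A hedged sketch of the control-plane setup QEMU performs against /dev/vhost-net, using the ioctls from <linux/vhost.h>. Error handling, the guest memory table and the vring layout ioctls are omitted; in a real setup the kick/call eventfds are also registered with KVM as ioeventfd/irqfd rather than created standalone as here.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/eventfd.h>
#include <linux/vhost.h>

static int vhost_net_setup(int tap_fd)
{
    int vhost_fd = open("/dev/vhost-net", O_RDWR);

    /* Bind this vhost instance to the calling process. */
    ioctl(vhost_fd, VHOST_SET_OWNER, NULL);

    /* QEMU would now describe guest memory (VHOST_SET_MEM_TABLE)
     * and the virtqueue layout (VHOST_SET_VRING_NUM/_ADDR/_BASE). */

    /* kickfd: guest -> host notification, callfd: host -> guest interrupt. */
    struct vhost_vring_file kick = { .index = 0, .fd = eventfd(0, 0) };
    struct vhost_vring_file call = { .index = 0, .fd = eventfd(0, 0) };
    ioctl(vhost_fd, VHOST_SET_VRING_KICK, &kick);
    ioctl(vhost_fd, VHOST_SET_VRING_CALL, &call);

    /* Attach the tap device: the kernel worker thread now moves buffers
     * between the tap queue and the virtqueue without exiting to QEMU. */
    struct vhost_vring_file backend = { .index = 0, .fd = tap_fd };
    ioctl(vhost_fd, VHOST_NET_SET_BACKEND, &backend);

    return vhost_fd;
}
```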

Page 14: Devconf2017 - Can VMs networking benefit from DPDK

14

Vhost as a user space interface (Overview)

● The vhost architecture is not tied to KVM
● Backend: a vhost instance in user space
● An eventfd is set up to signal the backend when new buffers are placed by the guest (kickfd)
● An irqfd is set up to signal the guest about new buffers placed by the backend (callfd)
● The beauty: the backend only knows about the guest memory mapping, the kick eventfd and the call eventfd (see the sketch below)
● Vhost-user implemented in DPDK in v16.07
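A hedged sketch of how a userspace backend drives those two file descriptors; the virtqueue processing itself is elided and backend_handle_kick() is a hypothetical function name, not part of any library API.

```c
#include <stdint.h>
#include <unistd.h>

/* Called when the guest has placed new buffers on the virtqueue. */
static void backend_handle_kick(int kickfd, int callfd)
{
    uint64_t kicks;

    /* eventfd read: clears the counter, i.e. acknowledges the guest's kick. */
    read(kickfd, &kicks, sizeof(kicks));

    /* ... process the available descriptors in the virtqueue here ... */

    /* eventfd write: signals the irqfd, which KVM turns into a guest interrupt. */
    uint64_t one = 1;
    write(callfd, &one, sizeof(one));
}
```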

Page 15: Devconf2017 - Can VMs networking benefit from DPDK

Challenges

Page 16: Devconf2017 - Can VMs networking benefit from DPDK

16

Performance (Challenges)

DPDK is about performance, which is a trade-off between:
● Bandwidth → achieving line rate even for small packets
● Latency → as low as possible (of course)
● CPU utilization → $$$

→ Prefer bandwidth & latency at the expense of CPU utilization
→ Take into account HW architectures as much as possible

Page 17: Devconf2017 - Can VMs networking benefit from DPDK

17

Reliability (Challenges)

0% packet loss
• Some use cases of Virtio cannot afford packet loss, e.g. NFV
• Hard to achieve max perf without loss, as Virtio is CPU intensive
  → Scheduling “glitches” may cause packet drops

Migration
• Requires restoration of internal state, including the backend's
• The interface exposed by QEMU must stay unchanged for cross-version migration
• The interface exposed to the guest depends on the capabilities of the third-party application
• Support from the management tool is required

Page 18: Devconf2017 - Can VMs networking benefit from DPDK

18

Security (Challenges)

• Isolation of untrusted guests
• Direct access to the device from untrusted guests
• Current implementations require a mediator for guest-to-guest communication
• Zero-copy is problematic from a security point of view

Page 19: Devconf2017 - Can VMs networking benefit from DPDK

New & upcoming features

Page 20: Devconf2017 - Can VMs networking benefit from DPDK

20

Rx mergeable buffers (New & upcoming features)

Pro:
• Allows receiving packets larger than the descriptors' buffer size

Con:
• Introduces an extra cache miss in the dequeue path (see the sketch below)

[Diagram] A descriptor ring (Desc 0 … Desc n) where a small packet fits in a single descriptor (num_buffers = 1) while a larger packet spans three descriptors (num_buffers = 3).
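A hedged sketch of the mergeable-Rx dequeue logic: the receiver must first load num_buffers from the virtio-net header (struct virtio_net_hdr_mrg_rxbuf, as in <linux/virtio_net.h>) before it knows how many descriptors to walk, which is the extra cache miss mentioned above. get_next_desc_buf() and append_to_packet() are hypothetical stand-ins for the real vring/mbuf accessors, and endianness handling is elided.

```c
#include <stdint.h>
#include <linux/virtio_net.h>

/* Hypothetical helpers standing in for the real vring/mbuf accessors. */
extern const void *get_next_desc_buf(void);
extern void append_to_packet(const void *buf);

static void dequeue_mergeable(void)
{
    /* The header sits at the start of the packet's first buffer. */
    const struct virtio_net_hdr_mrg_rxbuf *hdr = get_next_desc_buf();

    /* This load is the extra cache miss: without it we cannot know
     * how many descriptors this packet occupies. */
    uint16_t nb = hdr->num_buffers;

    /* Chain the remaining buffers into the same packet. */
    for (uint16_t i = 1; i < nb; i++)
        append_to_packet(get_next_desc_buf());
}
```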

Page 21: Devconf2017 - Can VMs networking benefit from DPDK

21

Indirect descriptors (DPDK v16.11) (New & upcoming features)

[Diagram] Direct descriptor chaining: a multi-buffer request consumes several entries of the main ring (Desc 0 … Desc n). Indirect descriptors table: the request consumes a single ring entry (Desc 2) pointing to a separate table of indirect descriptors (iDesc 2-0 … iDesc 2-3).

Page 22: Devconf2017 - Can VMs networking benefit from DPDK

22

Indirect descriptors (DPDK v16.11) (New & upcoming features)

Pros:
• Increases ring capacity
• Improves performance for a large number of large requests
• Improves 0% packet-loss perf even for small requests
  → If the system is not fine-tuned
  → If Virtio headers are in a dedicated descriptor

Cons:
• One more level of indirection (descriptor layout sketched below)
  → Impacts raw performance (~ -3%)
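A hedged illustration of the descriptor format behind this feature, using struct vring_desc and VRING_DESC_F_INDIRECT from <linux/virtio_ring.h>; address translation and endianness conversion are elided, and fill_indirect() is just an illustrative helper.

```c
#include <stdint.h>
#include <linux/virtio_ring.h>

/* Make one main-ring entry point at a separate table of descriptors, so a
 * request made of several buffers occupies a single ring slot. */
static void fill_indirect(struct vring_desc *ring_entry,
                          struct vring_desc *indir_table,
                          unsigned int table_entries)
{
    ring_entry->addr  = (uintptr_t)indir_table;  /* guest-physical address in reality */
    ring_entry->len   = table_entries * sizeof(struct vring_desc);
    ring_entry->flags = VRING_DESC_F_INDIRECT;   /* buffers are described in the table */
    ring_entry->next  = 0;
}
```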

Page 23: Devconf2017 - Can VMs networking benefit from DPDK

23

Vhost dequeue 0-copy (DPDK v16.11) (New & upcoming features)

[Diagram] Default dequeuing: the content of the descriptor's buffer is memcpy'd into the mbuf's own buffer. Zero-copy dequeuing: the mbuf's buffer address is set to point directly at the descriptor's buffer, so no copy takes place.

Page 24: Devconf2017 - Can VMs networking benefit from DPDK

24

Vhost dequeue 0-copy (DPDK v16.11) (New & upcoming features)

Pros:
• Big perf improvement for standard & large packet sizes
  → More than +50% for VM-to-VM with iperf benchmarks
• Reduces memory footprint

Cons:
• Performance degradation for small packets
  → But disabled by default (see the registration sketch below)
• Only for VM-to-VM using the Vhost lib API (no PMD support)
• Does not work for VM-to-NIC
  → The mbuf lacks a release notification mechanism / no headroom
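A hedged sketch of opting in to dequeue zero-copy when registering a vhost-user socket with DPDK's vhost library. The socket path is an illustrative assumption, the header name varies between releases (rte_virtio_net.h around v16.11, rte_vhost.h later), and session handling and the datapath are left out.

```c
#include <rte_vhost.h>   /* <rte_virtio_net.h> in v16.11-era releases */

int register_zero_copy_socket(void)
{
    /* Dequeue zero-copy is off by default; it is enabled per socket
     * with a registration flag. */
    return rte_vhost_driver_register("/tmp/vhost-user.sock",
                                     RTE_VHOST_USER_DEQUEUE_ZERO_COPY);
}
```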

Page 25: Devconf2017 - Can VMs networking benefit from DPDK

25

MTU feature (DPDK v17.05?) (New & upcoming features)

A way for the host to share its max supported MTU
• Can be used to set MTU values across the infrastructure
• Can improve performance for small packets (see the sketch below)
  → If the MTU fits in the Rx buffer size, Rx mergeable buffers can be disabled
  → Saves one cache miss when parsing the virtio-net header
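A hedged sketch of how a guest driver would consume this feature: if VIRTIO_NET_F_MTU is negotiated, the host-advertised MTU is read from the device config space (names as in recent <linux/virtio_net.h> and the virtio spec). read_device_config() is a hypothetical stand-in for the transport-specific config read.

```c
#include <stdint.h>
#include <linux/virtio_net.h>   /* VIRTIO_NET_F_MTU, struct virtio_net_config */

/* Hypothetical transport-specific config-space read. */
extern void read_device_config(void *buf, unsigned int len);

static uint16_t negotiated_mtu(uint64_t features)
{
    struct virtio_net_config cfg;

    if (!(features & (1ULL << VIRTIO_NET_F_MTU)))
        return 1500;             /* feature not offered: fall back to default MTU */

    read_device_config(&cfg, sizeof(cfg));
    return cfg.mtu;              /* host's maximum supported MTU */
}
```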

Page 26: Devconf2017 - Can VMs networking benefit from DPDK

26

Vhost-pci (DPDK v17.05?) (New & upcoming features)

[Diagram] Traditional VM-to-VM communication: VM1 (Virtio) and VM2 (Virtio) each go through a Vhost backend attached to the host's vSwitch. Direct VM-to-VM communication: VM1 exposes a Vhost-pci device connected directly to VM2's Virtio device, bypassing the vSwitch.

Page 27: Devconf2017 - Can VMs networking benefit from DPDK

27

Vhost-pci (DPDK v17.05?) (New & upcoming features)

[Diagram] VM1 runs a Vhost-pci driver on a Vhost-pci device; VM2 runs a Virtio-pci driver on a Virtio-pci device. The two devices are connected through the Vhost-pci protocol over a client/server link.

Page 28: Devconf2017 - Can VMs networking benefit from DPDK

28

Vhost-pci (DPDK v17.05?) (New & upcoming features)

Pros:
• Performance improvement
  → The 2 VMs share the same virtqueues
  → Packets don't go through the host's vSwitch
• No change needed in Virtio's guest drivers

Page 29: Devconf2017 - Can VMs networking benefit from DPDK

29

Vhost-pci (DPDK v17.05?) (New & upcoming features)

Cons:
• Security
  → The Vhost-pci VM maps the Virtio-pci VM's whole memory space
  → Could be solved with IOTLB support
• Live migration
  → Not supported in the current version
  → Hard to implement, as the VMs are connected to each other through a socket

Page 30: Devconf2017 - Can VMs networking benefit from DPDK

30

IOTLB in kernel (New & upcoming features)

[Diagram] The guest's IOMMU driver programs the virtual IOMMU emulated by QEMU. The in-kernel vhost keeps a device IOTLB: on a miss for an iova it asks QEMU, which replies with IOTLB updates/invalidations mapping the iova to an hva; vhost then accesses guest memory through that hva. QEMU can also push TLB invalidations. The exchange goes over syscalls/eventfds (message layout sketched below).
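A hedged sketch of the IOTLB message exchanged between QEMU and the kernel vhost device, assuming the struct vhost_iotlb_msg layout from recent <linux/vhost.h>; the same payload shape is reused for the vhost-user IOTLB messages on the next slide.

```c
#include <stdint.h>
#include <linux/vhost.h>

/* Build an IOTLB update telling vhost how to translate one guest IOVA range. */
static struct vhost_iotlb_msg make_iotlb_update(uint64_t iova, uint64_t size,
                                                uint64_t uaddr)
{
    struct vhost_iotlb_msg msg = {
        .iova  = iova,               /* I/O virtual address used by the device */
        .size  = size,               /* length of the mapping */
        .uaddr = uaddr,              /* QEMU (userspace) virtual address */
        .perm  = VHOST_ACCESS_RW,    /* read/write access */
        .type  = VHOST_IOTLB_UPDATE, /* other types: MISS, INVALIDATE */
    };
    return msg;
}
```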

Page 31: Devconf2017 - Can VMs networking benefit from DPDK

31

IOTLB for vhost-user (New & upcoming features)

[Diagram] Same flow as the in-kernel case, but the IOTLB cache lives in the vhost-user backend: IOTLB misses, updates and invalidations are exchanged between QEMU and the backend over the Unix socket instead of syscalls/eventfds.

Page 32: Devconf2017 - Can VMs networking benefit from DPDK

32

Conclusions

• DPDK support for VMs is in active development
• 10 Mpps is real in a VM with DPDK
• New features to boost the performance of VM networking
• Accelerating the transition to NFV / SDN

Page 33: Devconf2017 - Can VMs networking benefit from DPDK

33

Q / A

Page 34: Devconf2017 - Can VMs networking benefit from DPDK

THANK YOU