Date:   Fri, 29 Sep 2023 22:46:02 +0200
From:   Jakub Sitnicki <jakub@...udflare.com>
To:     virtualization@...ts.linux-foundation.org
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
        linux-kernel@...r.kernel.org, kernel-team@...udflare.com
Subject: [PATCH 0/2] Support multiple interrupts for virtio over MMIO devices

# Intro

This patch set enables virtio-mmio devices to use multiple interrupts.

The elevator pitch would be:

"""
To keep the complexity down to a minimum, but at the same time get to the
same performance level as virtio-pci devices, we:

1) keep using the legacy interrupts,
2) have a predefined, device-type-specific mapping of IRQs to virtqueues
   (sketched right after this quote), and
3) rely on vhost offload for both data and notifications (irqfd/ioeventfd).
"""

As this is an RFC, we aim to (i) present our use case, and (ii) find out
whether we are going in the right direction.

Otherwise, we have kept the changes down to a working minimum with which we
can already demonstrate the performance benefits.

At this point, we have not:
- drafted any change proposals to the VIRTIO spec, or
- added support for virtio-mmio driver "configuration backends" other than
  the kernel command line, that is, ACPI and DT (see the example right after
  this list).
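
For context, today's kernel command line backend takes exactly one IRQ per
device:

    virtio_mmio.device=4K@0xd0000000:5

A multi-IRQ device would need something like an IRQ range instead, for
example 4K@0xd0000000:5-12. That range syntax is only a guess here; the
actual format is whatever patch 1 defines.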

# Motivation

This work aims to enable lightweight VMs (like QEMU microvm, Firecracker,
Cloud Hypervisor), which rely on the virtio MMIO transport, to utilize
multi-queue virtio NICs to their full potential when multiple vCPUs are
available.

Currently, with the MMIO transport, it is not possible to process vNIC queue
events in parallel because there is just one interrupt per virtio-mmio
device, and hence one CPU processing all the virtqueue events.
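
For reference, the mainline driver funnels every virtqueue through a single
handler, roughly like the simplified sketch below (paraphrased from
drivers/virtio/virtio_mmio.c; types and helpers come from that driver, and
the config-change branch and locking are omitted):

    static irqreturn_t vm_interrupt(int irq, void *opaque)
    {
            struct virtio_mmio_device *vm_dev = opaque;
            struct virtio_mmio_vq_info *info;
            unsigned long status;
            irqreturn_t ret = IRQ_NONE;

            /* Read and acknowledge the interrupt status. */
            status = readl(vm_dev->base + VIRTIO_MMIO_INTERRUPT_STATUS);
            writel(status, vm_dev->base + VIRTIO_MMIO_INTERRUPT_ACK);

            if (status & VIRTIO_MMIO_INT_VRING) {
                    /* One IRQ for all virtqueues: poke each of them,
                     * all on the CPU that took the interrupt. */
                    list_for_each_entry(info, &vm_dev->virtqueues, node)
                            ret |= vring_interrupt(irq, info->vq);
            }

            return ret;
    }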

We are looking to change that, so that the vNIC performance (measured in
pps) scales together with the number of vNIC queues, and allocated vCPUs.

Our goal is to reach the same pps level as virtio-pci vNIC delivers today.

# Prior Work

So far we have seen two attempts at making virtio-mmio devices use multiple
IRQs: first in 2014 [1], then in 2020 [2]. At least, that is all we could
find.

Judging from the discussions and review feedback, the pitfalls in the
previous submissions were:

1. lack of proof that there are performance benefits (see [1]),
2. code complexity (see [2]),
3. no reference VMM (QEMU) implementation ([1] and [2]).

We try not to repeat these mistakes.

[1] https://lore.kernel.org/r/1415093712-15156-1-git-send-email-zhaoshenglong@huawei.com/
[2] https://lore.kernel.org/r/cover.1581305609.git.zhabin@linux.alibaba.com/

# Benchmark Setup and Results

Traffic flow:

host -> guest (reflect in XDP native) -> host

The host-guest-host flow with an XDP program reflecting UDP packets is just
one of the production use cases we are interested in.

Another one is a typical host-to-guest scenario, where UDP flows are
terminated in the guest. The latter, however, takes more work to benchmark
because it requires manual sender throttling to avoid very high losses at
the receiver.

Setup details:

- guest:
  - Linux v6.5 + this patchset
  - 8 vCPUs
  - 16 vNIC queues (8 in use + 8 for lockless XDP TX)
- host
  - VMM - QEMU v8.1.0 + PoC changes (see below)
  - vhost offload enabled
  - iperf3 v3.12 used as sender and receiver
- traffic pattern
  - 8 uni-directional, small-packet UDP flows
  - flow steering - one flow per vNIC RX queue
- CPU affinity
  - iperf clients, iperf servers, KVM vCPU threads, vhost threads pinned to
    their own logical CPUs (see the sketch after this list)
  - all used logical CPUs on the same NUMA node
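
Until we publish the scripts, the iperf3 invocations and pinning look roughly
as below. CPU ids, ports, and the guest address are placeholders, and the
flow steering and vhost/vCPU thread pinning steps are not shown:

    # Guest: one iperf3 server per flow, each pinned to its own vCPU.
    for i in $(seq 0 7); do
            taskset -c "$i" iperf3 -s -p $((5201 + i)) -D
    done

    # Host: 8 uni-directional, small-packet UDP flows, one per server,
    # each sender pinned to its own logical CPU.
    for i in $(seq 0 7); do
            taskset -c $((16 + i)) iperf3 -c 192.0.2.10 -u -b 0 -l 64 \
                    -p $((5201 + i)) -t 60 &
    done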

Recorded receiver pps:

                      virtio-pci      virtio-mmio     virtio-mmio
                      8+8+1 IRQs      8 IRQs          1 IRQ

 rx pps (mean ± rsd): 217,743 ± 2.4%  221,741 ± 2.7%  48,910 ± 0.03%
pkt loss (min … max):    1.8% … 2.3%     2.9% … 3.6%   82.1% … 89.3%

rx pps is the average over the 8 receivers, each receiving one UDP flow.
pkt loss is not aggregated; the loss for each individual UDP flow falls
within the given range.

If anyone would like to reproduce these results, we would be happy to share
detailed setup steps and tooling (scripts).

# PoC QEMU changes

QEMU is the only VMM known to us where we can compare the performance of the
virtio PCI and MMIO transports with a multi-queue virtio NIC and vhost
offload.

Hence, accompanying these patches, we also share rather raw, not yet
review-ready, QEMU code changes that we used to test and benchmark a
virtio-mmio device with multiple IRQs.

The tag with changes is available at:

https://github.com/jsitnicki/qemu/commits/virtio-mmio-multi-irq-rfc1

# Open Questions

- Do we need a feature flag, for example VIRTIO_F_MULTI_IRQ, for the guest to
  inform the VMM that it understands the feature?

  Or can we assume that the VMM assigns multiple IRQs to a virtio-mmio device
  only if the guest is compatible? (A rough sketch of how such a flag could
  be negotiated follows below.)
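
For discussion, negotiating such a flag in the driver could be as small as
the sketch below. The bit number and the helper name are made up; nothing
here is in the VIRTIO spec today.

    /* Hypothetical feature bit; the number is a placeholder, not spec'd. */
    #define VIRTIO_F_MULTI_IRQ      41

    /* Use multiple IRQs only if both sides accepted the hypothetical
     * feature bit; otherwise fall back to the single legacy interrupt. */
    static bool vm_use_multi_irq(struct virtio_device *vdev)
    {
            return virtio_has_feature(vdev, VIRTIO_F_MULTI_IRQ);
    }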

Looking forward to your feedback.

Jakub Sitnicki (2):
      virtio-mmio: Parse a range of IRQ numbers passed on the command line
      virtio-mmio: Support multiple interrupts per device

 drivers/virtio/virtio_mmio.c | 179 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 135 insertions(+), 44 deletions(-)
