lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACGkMEsFc-URhXBCGZ1=CTMZKcWPf57pYy1TcyKLL=N65u+F0Q@mail.gmail.com>
Date: Mon, 7 Apr 2025 10:14:51 +0800
From: Jason Wang <jasowang@...hat.com>
To: Jon Kohler <jon@...anix.com>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, Eugenio Pérez <eperezma@...hat.com>, 
	kvm@...r.kernel.org, virtualization@...ts.linux.dev, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] vhost/net: remove zerocopy support

On Fri, Apr 4, 2025 at 10:24 PM Jon Kohler <jon@...anix.com> wrote:
>
> Commit 098eadce3c62 ("vhost_net: disable zerocopy by default") disabled
> the module parameter for the handle_tx_zerocopy path back in 2019,
> nothing that many downstream distributions (e.g., RHEL7 and later) had
> already done the same.
>
> Both upstream and downstream disablement suggest this path is rarely
> used.
>
> Testing the module parameter shows that while the path allows packet
> forwarding, the zerocopy functionality itself is broken. On outbound
> traffic (guest TX -> external), zerocopy SKBs are orphaned by either
> skb_orphan_frags_rx() (used with the tun driver via tun_net_xmit())

This is by design to avoid DOS.

> or
> skb_orphan_frags() elsewhere in the stack,

Basically zerocopy is expected to work for guest -> remote case, so
could we still hit skb_orphan_frags() in this case?

> as vhost_net does not set
> SKBFL_DONT_ORPHAN.
>
> Orphaning enforces a memcpy and triggers the completion callback, which
> increments the failed TX counter, effectively disabling zerocopy again.
>
> Even after addressing these issues to prevent SKB orphaning and error
> counter increments, performance remains poor. By default, only 64
> messages can be zerocopied, which is immediately exhausted by workloads
> like iperf, resulting in most messages being memcpy'd anyhow.
>
> Additionally, memcpy'd messages do not benefit from the XDP batching
> optimizations present in the handle_tx_copy path.
>
> Given these limitations and the lack of any tangible benefits, remove
> zerocopy entirely to simplify the code base.
>
> Signed-off-by: Jon Kohler <jon@...anix.com>

Any chance we can fix those issues? Actually, we had a plan to make
use of vhost-net and its tx zerocopy (or even implement the rx
zerocopy) in pasta.

Eugenio may explain more here.

Thanks


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ