Message-ID: <CACGkMEsFc-URhXBCGZ1=CTMZKcWPf57pYy1TcyKLL=N65u+F0Q@mail.gmail.com>
Date: Mon, 7 Apr 2025 10:14:51 +0800
From: Jason Wang <jasowang@...hat.com>
To: Jon Kohler <jon@...anix.com>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, Eugenio Pérez <eperezma@...hat.com>,
kvm@...r.kernel.org, virtualization@...ts.linux.dev, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] vhost/net: remove zerocopy support
On Fri, Apr 4, 2025 at 10:24 PM Jon Kohler <jon@...anix.com> wrote:
>
> Commit 098eadce3c62 ("vhost_net: disable zerocopy by default") disabled
> the module parameter for the handle_tx_zerocopy path back in 2019,
> noting that many downstream distributions (e.g., RHEL7 and later) had
> already done the same.
>
> Both upstream and downstream disablement suggest this path is rarely
> used.
>
> Testing the module parameter shows that while the path allows packet
> forwarding, the zerocopy functionality itself is broken. On outbound
> traffic (guest TX -> external), zerocopy SKBs are orphaned by either
> skb_orphan_frags_rx() (used with the tun driver via tun_net_xmit())
This is by design, to avoid DoS.
> or
> skb_orphan_frags() elsewhere in the stack,
Basically, zerocopy is expected to work for the guest -> remote case, so
could we still hit skb_orphan_frags() in this case?
> as vhost_net does not set
> SKBFL_DONT_ORPHAN.
>
> Orphaning enforces a memcpy and triggers the completion callback, which
> increments the failed TX counter, effectively disabling zerocopy again.
>
> Even after addressing these issues to prevent SKB orphaning and error
> counter increments, performance remains poor. By default, only 64
> messages can be zerocopied, which is immediately exhausted by workloads
> like iperf, resulting in most messages being memcpy'd anyhow.
>
> Additionally, memcpy'd messages do not benefit from the XDP batching
> optimizations present in the handle_tx_copy path.
>
> Given these limitations and the lack of any tangible benefits, remove
> zerocopy entirely to simplify the code base.
>
> Signed-off-by: Jon Kohler <jon@...anix.com>
Any chance we can fix those issues? Actually, we had a plan to make
use of vhost-net and its tx zerocopy (or even implement the rx
zerocopy) in pasta.
Eugenio may explain more here.
Thanks