Message-ID: <CAO55csx=QqqZFB8jEKerVRjwSTKRoLZuGhuCOL3yR6+q0vriDg@mail.gmail.com>
Date: Wed, 15 Oct 2025 11:16:52 +0200
From: Maxime Coquelin <mcoqueli@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Eugenio Perez Martin <eperezma@...hat.com>, Yongji Xie <xieyongji@...edance.com>, 
	virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org, 
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Dragos Tatulea DE <dtatulea@...dia.com>, jasowang@...hat.com
Subject: Re: [RFC 1/2] virtio_net: timeout control virtqueue commands

On Wed, Oct 15, 2025 at 10:09 AM Michael S. Tsirkin <mst@...hat.com> wrote:
>
> On Wed, Oct 15, 2025 at 10:03:49AM +0200, Maxime Coquelin wrote:
> > On Wed, Oct 15, 2025 at 9:45 AM Eugenio Perez Martin
> > <eperezma@...hat.com> wrote:
> > >
> > > On Wed, Oct 15, 2025 at 9:05 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> > > >
> > > > On Wed, Oct 15, 2025 at 08:52:50AM +0200, Eugenio Perez Martin wrote:
> > > > > On Wed, Oct 15, 2025 at 8:33 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> > > > > >
> > > > > > On Wed, Oct 15, 2025 at 08:08:31AM +0200, Eugenio Perez Martin wrote:
> > > > > > > On Tue, Oct 14, 2025 at 11:25 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Oct 14, 2025 at 11:14:40AM +0200, Maxime Coquelin wrote:
> > > > > > > > > On Tue, Oct 14, 2025 at 10:29 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, Oct 07, 2025 at 03:06:21PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > A userland device implemented through VDUSE could hold the rtnl lock
> > > > > > > > > > > forever if the virtio-net driver is running on top of virtio_vdpa.
> > > > > > > > > > > Let's break the device if it does not return the buffer within an
> > > > > > > > > > > unreasonably long timeout.
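
For readers without the patch at hand, a bounded wait there might look
roughly like the fragment below. This is only a sketch of the idea, not the
actual RFC code; CVQ_TIMEOUT_MS is an invented constant and the real patch
may differ:

/* Hypothetical sketch, not the actual RFC patch: bound the busy-wait
 * in virtnet_send_command() and break the device when the (invented)
 * CVQ_TIMEOUT_MS budget expires.
 */
unsigned long timeout = jiffies + msecs_to_jiffies(CVQ_TIMEOUT_MS);
unsigned int tmp;

while (!virtqueue_get_buf(vi->cvq, &tmp) &&
       !virtqueue_is_broken(vi->cvq)) {
        if (time_after(jiffies, timeout)) {
                virtio_break_device(vi->vdev);
                return false; /* device stopped responding */
        }
        cpu_relax();
}
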
> > > > > > > > > >
> > > > > > > > > > So now I can't debug qemu with gdb, because the guest dies :(
> > > > > > > > > > Let's not break valid use cases, please.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Instead, solve it in vduse, probably by handling the cvq within
> > > > > > > > > > the kernel.
> > > > > > > > >
> > > > > > > > > Would a shadow control virtqueue implementation in the VDUSE driver work?
> > > > > > > > > It would systematically ack messages sent by the virtio-net driver,
> > > > > > > > > assuming the userspace application will eventually ack them itself.
> > > > > > > > >
> > > > > > > > > Then, when the userspace application actually handles a message, if
> > > > > > > > > the handling fails, it somehow marks the device as broken?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Maxime
> > > > > > > >
> > > > > > > > Yes, but it's a bit more convoluted than just acking them.
> > > > > > > > Once you use the buffer, you can get another one, and so on
> > > > > > > > with no limit.
> > > > > > > > One fix is to actually maintain device state in the
> > > > > > > > kernel, update it, and then notify userspace.
> > > > > > > >
> > > > > > >
> > > > > > > I thought of implementing this approach at first, but it has two drawbacks.
> > > > > > >
> > > > > > > The first one: it's racy. Let's say the driver updates the MAC filter,
> > > > > > > the VDUSE timeout fires, the guest receives the failure, and then the
> > > > > > > device replies with an OK. There is no way for the device or VDUSE to
> > > > > > > update the driver.
> > > > > >
> > > > > > There's no timeout. The kernel can guarantee executing all requests.
> > > > > >
> > > > >
> > > > > I don't follow this. How should the VDUSE kernel module act, then, if
> > > > > the VDUSE userland device does not use the CVQ buffer?
> > > >
> > > > First, I am not sure a VQ is the best interface for talking to userspace.
> > > > But assuming it is: just avoid sending more data, and send it later,
> > > > after userspace has used the buffer.
> > > >
> > >
> > > Let me take a step back; I think I didn't describe the scenario well enough.
> > >
> > > We have a VDUSE device, and then the same host is interacting with the
> > > device through the virtio_net driver over virtio_vdpa.
> > >
> > > Then, the virtio_net driver sends a control command through its CVQ, so
> > > it *takes the RTNL*. That command reaches the VDUSE CVQ.
> > >
> > > It does not matter whether the userland VDUSE device processes the
> > > commands through a CVQ, by reading the vduse character device, or by
> > > some other mechanism. The question is: what should we do if the VDUSE
> > > device does not process the command in a timely manner? Should we just
> > > let the RTNL be held forever?
> > >
> >
> > My understanding is that:
> > 1. Virtio-net sends a control message and waits for the reply
> > 2. The VDUSE driver dequeues it, adds it to the SCVQ, and replies OK on the CVQ
> > 3. The userspace application dequeues the message from the SCVQ
> >  a. If handling is successful, it replies OK
> >  b. If handling fails, it replies ERROR
> > 4. The VDUSE driver reads the reply
> >  a. If OK, do nothing
> >  b. If ERROR, mark the device as broken?
> >
> > This is simplified, as it does not take into account SCVQ overflow if
> > the application is stuck.
> > IIUC, Michael suggests only enqueuing a single message at a time in the
> > SCVQ, and buffering the pending messages in the VDUSE driver.
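
To make that flow concrete, a rough model is sketched below. Nothing here
exists in the VDUSE driver today; every name is made up for illustration:

/* Hypothetical model: ack the guest right away, but forward at most
 * one command at a time to userspace and keep the rest in a
 * kernel-side list.
 */
struct scvq {
        struct list_head pending; /* not yet sent to userspace */
        bool inflight;            /* one command outstanding on the SCVQ */
        bool broken;
};

struct scvq_cmd {
        struct list_head node;
        /* opaque payload copied from the guest's CVQ buffer */
};

static void scvq_kick(struct scvq *s)
{
        struct scvq_cmd *cmd = list_first_entry(&s->pending,
                                                struct scvq_cmd, node);

        list_del(&cmd->node);
        s->inflight = true;
        /* enqueue cmd on the shadow CVQ for the userspace device */
}

/* Step 2: ack the guest immediately, queue the command for userspace. */
static void scvq_cvq_command(struct scvq *s, struct scvq_cmd *cmd)
{
        /* complete the guest's CVQ buffer with VIRTIO_NET_OK here */
        list_add_tail(&cmd->node, &s->pending);
        if (!s->inflight)
                scvq_kick(s);
}

/* Step 4: the userspace reply decides whether the device survives. */
static void scvq_reply(struct scvq *s, bool ok)
{
        s->inflight = false;
        if (!ok)
                s->broken = true;  /* 4b: mark the device broken */
        else if (!list_empty(&s->pending))
                scvq_kick(s);      /* 4a: push the next pending command */
}
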
>
> Not exactly buffering, recording.  E.g. we do not need to send
> 100 messages to enable/disable promisc mode - together they
> have no effect.
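
If I understand the "record" idea correctly, it is something like the
sketch below: fold each command into the latest desired state, so a hundred
promisc toggles collapse into a single sync. Again, every name here is
invented for illustration:

/* Hypothetical "record, don't buffer" sketch. */
struct vduse_net_state {
        bool promisc;
        bool allmulti;
        bool dirty; /* changed since the last sync to userspace */
};

static void record_rx_mode(struct vduse_net_state *st,
                           bool promisc, bool allmulti)
{
        /* later commands simply overwrite earlier ones */
        st->promisc = promisc;
        st->allmulti = allmulti;
        st->dirty = true;
}

/* Called when userspace is ready for the next message: send only the
 * net effect of everything recorded since the previous sync.
 */
static void sync_to_userspace(struct vduse_net_state *st)
{
        if (!st->dirty)
                return;
        /* push one message describing 'st' to the userspace device */
        st->dirty = false;
}
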

The downside of such an optimization is that it requires the VDUSE kernel
driver to be able to handle every message type.

So every time we add support for a new control message type, we will also
have to patch the VDUSE kernel driver.

I am not sure the gain is worth the effort, as the traffic on the control
queue is usually rather low?

Maxime

>
> --
> MST
>

