Message-ID: <CAJaqyWe-mn4e+1egNCH+R1x4R7DB6U1SZ-mRAXYPTtA27hKCVA@mail.gmail.com>
Date: Wed, 15 Oct 2025 08:08:31 +0200
From: Eugenio Perez Martin <eperezma@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Maxime Coquelin <mcoqueli@...hat.com>, Yongji Xie <xieyongji@...edance.com>,
virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Dragos Tatulea DE <dtatulea@...dia.com>, jasowang@...hat.com
Subject: Re: [RFC 1/2] virtio_net: timeout control virtqueue commands
On Tue, Oct 14, 2025 at 11:25 AM Michael S. Tsirkin <mst@...hat.com> wrote:
>
> On Tue, Oct 14, 2025 at 11:14:40AM +0200, Maxime Coquelin wrote:
> > On Tue, Oct 14, 2025 at 10:29 AM Michael S. Tsirkin <mst@...hat.com> wrote:
> > >
> > > On Tue, Oct 07, 2025 at 03:06:21PM +0200, Eugenio Pérez wrote:
> > > > An userland device implemented through VDUSE could take rtnl forever if
> > > > the virtio-net driver is running on top of virtio_vdpa. Let's break the
> > > > device if it does not return the buffer in a longer-than-assumible
> > > > timeout.
> > >
> > > So now I can't debug qemu with gdb because guest dies :(
> > > Let's not break valid use-cases please.
> > >
> > >
> > > Instead, solve it in vduse, probably by handling cvq within
> > > kernel.
> >
> > Would a shadow control virtqueue implementation in the VDUSE driver work?
> > It would ack systematically messages sent by the Virtio-net driver,
> > and so assume the userspace application will Ack them.
> >
> > When the userspace application handles the message, if the handling fails,
> > it somehow marks the device as broken?
> >
> > Thanks,
> > Maxime
>
> Yes but it's a bit more convoluted than just acking them.
> Once you use the buffer you can get another one and so on
> with no limit.
> One fix is to actually maintain device state in the
> kernel, update it, and then notify userspace.
>
I thought of implementing this approach at first, but it has two drawbacks.
The first one is that it's racy. Let's say the driver updates the MAC filter,
the VDUSE timeout occurs, the guest receives the failure, and then the device
replies with an OK. At that point there is no way for the device or VDUSE to
tell the driver that the command actually succeeded.
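To make the race concrete, here is a toy model in Python. It is not kernel code, and every class and function name in it (ToyDriver, ToyDevice, run_race) is made up for illustration. The only values taken from the virtio-net spec are the two ack codes. It only shows the ordering problem: the timeout answers the driver first, and the device's later OK has nowhere to go.

```python
# Toy model of the CVQ timeout race. Not kernel code: ToyDriver, ToyDevice
# and run_race are hypothetical names used only for this illustration.
VIRTIO_NET_OK = 0   # ack values as defined by the virtio-net spec
VIRTIO_NET_ERR = 1


class ToyDriver:
    """Driver side: commits a change only if the command status is OK."""

    def __init__(self):
        self.mac_filter = ["old-mac"]

    def complete_set_mac_filter(self, new_filter, status):
        if status == VIRTIO_NET_OK:
            self.mac_filter = list(new_filter)
        return status


class ToyDevice:
    """Userspace device: applies the command whenever it gets to it."""

    def __init__(self):
        self.mac_filter = ["old-mac"]

    def handle_set_mac_filter(self, new_filter):
        self.mac_filter = list(new_filter)
        return VIRTIO_NET_OK


def run_race():
    driver, device = ToyDriver(), ToyDevice()
    new_filter = ["new-mac"]

    # 1. The driver sends the command, but the device is slow.
    # 2. The timeout fires first, so the driver's buffer is returned
    #    with an error and the driver keeps its old filter.
    driver.complete_set_mac_filter(new_filter, VIRTIO_NET_ERR)

    # 3. The device finally processes the command and replies OK.
    #    The driver's buffer is already gone, so this status is lost.
    late_status = device.handle_set_mac_filter(new_filter)

    return driver.mac_filter, device.mac_filter, late_status
```

After run_race() the driver still believes the old filter is in place while the device has applied the new one, and the late OK cannot be delivered to anybody.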
The second one: what do we do when the VDUSE CVQ runs out of descriptors?
The driver has its descriptor returned with VIRTIO_NET_ERR, but the
descriptor is still available in the VDUSE CVQ. If this repeats until all
of the VDUSE CVQ descriptors have been made available, how can we proceed?
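The exhaustion case can be sketched the same way. Again this is only a toy model under my own assumptions: ToyShadowCvq and commands_until_stuck are hypothetical names, and the queue is reduced to a counter of descriptors that userspace has not used yet.

```python
# Toy model of VDUSE CVQ descriptor exhaustion. Not kernel code:
# ToyShadowCvq and commands_until_stuck are hypothetical names.
VIRTIO_NET_ERR = 1  # ack value as defined by the virtio-net spec


class ToyShadowCvq:
    def __init__(self, size):
        self.size = size
        self.in_flight = 0  # made available to userspace, not yet used

    def submit(self):
        """Forward one driver command; False if no descriptor is free."""
        if self.in_flight == self.size:
            return False
        self.in_flight += 1
        return True

    def timeout(self):
        # The driver gets its own buffer back with an error, but the
        # descriptor in the VDUSE CVQ stays available: in_flight is
        # deliberately left untouched.
        return VIRTIO_NET_ERR


def commands_until_stuck(size):
    """Count how many timed-out commands fit before the queue is full."""
    cvq = ToyShadowCvq(size)
    sent = 0
    while cvq.submit():
        sent += 1
        cvq.timeout()
    return sent
```

With an unresponsive device, a queue of N descriptors accepts exactly N commands, and after that there is no descriptor left to forward the next one.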
I think both of them can be solved with the DEVICE_NEEDS_RESET status
bit, but it is not implemented in the drivers at this moment.