Message-ID: <CACGkMEt29t9CK2Muiuyb1s6p2AzgcMiD_z0NVFn1d+KEqBydug@mail.gmail.com>
Date: Tue, 28 Mar 2023 10:58:48 +0800
From: Jason Wang <jasowang@...hat.com>
To: Dominique Martinet <asmadeus@...ewreck.org>
Cc: Albert Huang <huangjie.albert@...edance.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
virtualization@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, Luis Chamberlain <mcgrof@...nel.org>,
v9fs-developer@...ts.sourceforge.net,
Eric Van Hensbergen <ericvh@...il.com>,
Christian Schoenebeck <linux_oss@...debyte.com>
Subject: Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event
idx on get_buf)
On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
<asmadeus@...ewreck.org> wrote:
>
> Hi Michael, Albert,
>
> Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > In virtio_net, if we disable napi_tx, when we trigger a tx interrupt,
> > vq->event_triggered will be set to true. It will not be set back to
> > false unless we explicitly call virtqueue_enable_cb_delayed or
> > virtqueue_enable_cb_prepare.
>
> This patch (committed as 35395770f803 ("virtio_ring: don't update event
> idx on get_buf") in next-20230327) apparently breaks 9p, as reported by
> Luis in https://lkml.kernel.org/r/ZCI+7Wg5OclSlE8c@bombadil.infradead.org
>
> I've just had a look at recent patches[1] and reverted this to test
> and I can mount again, so I'm pretty sure this is the culprit, but I
> haven't looked at the content at all yet so cannot advise further.
> It might very well be that we need some extra handling for 9p
> specifically that can be added separately if required.
>
> [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
>
>
> This can be reproduced with a simple mount, run qemu with some -virtfs
> argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> these messages:
> 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> 9pnet: -- p9_virtio_request (83): virtio request kicked
>
> So I suspect we're just not getting a callback.
I think so. The patch assumes the driver will call
virtqueue_disable_cb()/virtqueue_enable_cb(), which the 9p driver does
not do. So after the first interrupt, event_triggered stays true forever
and get_buf no longer publishes a new event index to the device.
Thanks
>
>
> I'll have a closer look after work, but any advice meanwhile will be
> appreciated!
> (I'm sure Luis would also like a temporary drop from -next until
> this is figured out, but I'll leave this up to you)
>
>
> >
> > If we disable the napi_tx, it will only be called when the tx ring
> > buffer is relatively small.
> >
> > Because event_triggered is true, VRING_AVAIL_F_NO_INTERRUPT or
> > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > every time we call virtqueue_get_buf_ctx. This causes more interrupts.
> >
> > To summarize:
> > 1) event_triggered was set to true in vring_interrupt()
> > 2) after this nothing will happen for virtqueue_disable_cb() so
> > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > then it tries to publish new event
> >
> > To fix, if event_triggered is set to true, do not update
> > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> >
> > Tested with iperf:
> > iperf3 tcp stream:
> > vm1 -----------------> vm2
> > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > there are many tx interrupts in vm2.
> > but without event_triggered there are just a few tx interrupts.
> >
> > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > Signed-off-by: Albert Huang <huangjie.albert@...edance.com>
> > Message-Id: <20230321085953.24949-1-huangjie.albert@...edance.com>
> > Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
> > ---
> > drivers/virtio/virtio_ring.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cbeeea1b0439..1c36fa477966 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > /* If we expect an interrupt for the next entry, tell host
> > * by writing event index and flush out the write before
> > * the read in the next get_buf call. */
> > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > + !vq->event_triggered))
> > virtio_store_mb(vq->weak_barriers,
> > &vring_used_event(&vq->split.vring),
> > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > * by writing event index and flush out the write before
> > * the read in the next get_buf call.
> > */
> > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > + !vq->event_triggered))
> > virtio_store_mb(vq->weak_barriers,
> > &vq->packed.vring.driver->off_wrap,
> > cpu_to_le16(vq->last_used_idx));
>