Message-ID: <1dcec730-ec26-46f4-ba4c-06101fcc599e@redhat.com>
Date: Mon, 25 Mar 2024 17:34:29 +1000
From: Gavin Shan <gshan@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Will Deacon <will@...nel.org>, virtualization@...ts.linux.dev,
linux-kernel@...r.kernel.org, jasowang@...hat.com,
xuanzhuo@...ux.alibaba.com, yihyu@...hat.com, shan.gavin@...il.com,
linux-arm-kernel@...ts.infradead.org,
Catalin Marinas <catalin.marinas@....com>, mochs@...dia.com
Subject: Re: [PATCH] virtio_ring: Fix the stale index in available ring
On 3/20/24 17:14, Michael S. Tsirkin wrote:
> On Wed, Mar 20, 2024 at 03:24:16PM +1000, Gavin Shan wrote:
>> On 3/20/24 10:49, Michael S. Tsirkin wrote:
>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>> index 6f7e5010a673..79456706d0bd 100644
>>> --- a/drivers/virtio/virtio_ring.c
>>> +++ b/drivers/virtio/virtio_ring.c
>>> @@ -685,7 +685,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
>>> /* Put entry in available array (but don't update avail->idx until they
>>> * do sync). */
>>> avail = vq->split.avail_idx_shadow & (vq->split.vring.num - 1);
>>> - vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, head);
>>> + u16 headwithflag = head | (vq->split.avail_idx_shadow & ~(vq->split.vring.num - 1));
>>> + vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, headwithflag);
>>> /* Descriptors and available array need to be set before we expose the
>>> * new available array entries. */
>>>
OK, Michael. I continued with my debugging code. It still looks like a
hardware bug on NVIDIA's Grace Hopper. I really think NVIDIA needs to be
involved in the discussion, as you suggested.

Firstly, I bound the vhost process and the vCPU thread to CPU#71 and CPU#70
respectively. Note that I have only one vCPU in my configuration.

Secondly, the debugging code is enhanced so that the available head for
(last_avail_idx - 1) is read twice and both values are recorded. In other
words, the available head for one specific available index is read twice
in a row. I do see that the two consecutive reads return different
available heads. More details are shared below.

From the guest side
===================
virtio_net virtio0: output.0:id 86 is not a head!
head to be released: 047 062 112
avail_idx:
000 49665
001 49666 <--
:
015 49664
avail_head:
000 062
001 047 <--
:
015 112
From the host side
==================
avail_idx
000 49663
001 49666 <---
:
avail_head
000 062 (062)
001 047 (047) <---
:
015 086 (112) // head 086 is returned from the first read,
// but head 112 is returned from the second read
vhost_get_vq_desc: Inconsistent head in two read (86 -> 112) for avail_idx 49664
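
For readers who want to picture the check, here is a minimal user-space
sketch of the "read the head twice and compare" idea described above. It is
not the actual vhost debugging patch, which is not included in this message.
The names avail_ring, head_sample, read_once_u16, sample_head_twice and the
ring size RING_NUM are all hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_NUM 256u   /* hypothetical ring size, must be a power of two */

/* Simplified stand-in for the split available ring (illustration only). */
struct avail_ring {
	uint16_t idx;
	uint16_t ring[RING_NUM];
};

/* One recorded sample: the available index and the two heads read for it. */
struct head_sample {
	uint16_t avail_idx;
	uint16_t first;
	uint16_t second;
};

/* Go through a volatile pointer so the compiler emits a real load each time. */
static inline uint16_t read_once_u16(const volatile uint16_t *p)
{
	return *p;
}

/*
 * Read the head stored for (last_avail_idx - 1) twice and record both
 * values. Returns true when the two reads agree, false otherwise.
 */
static bool sample_head_twice(const struct avail_ring *avail,
			      uint16_t last_avail_idx,
			      struct head_sample *out)
{
	uint16_t idx = (uint16_t)(last_avail_idx - 1);
	const volatile uint16_t *slot = &avail->ring[idx & (RING_NUM - 1)];

	out->avail_idx = idx;
	out->first = read_once_u16(slot);
	out->second = read_once_u16(slot);

	return out->first == out->second;
}
```

On a coherent system the two loads of the same slot should agree unless the
writer updates the slot between them, so a false return for an index that the
guest has already published (as in the 86 -> 112 case above) is what points
at a stale read.
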
Thanks,
Gavin