Message-ID: <20251117174525.3690c712.alex@shazbot.org>
Date: Mon, 17 Nov 2025 17:45:25 -0700
From: Alex Williamson <alex@...zbot.org>
To: <ankita@...dia.com>
Cc: <jgg@...pe.ca>, <yishaih@...dia.com>, <skolothumtho@...dia.com>,
<kevin.tian@...el.com>, <aniketa@...dia.com>, <vsethi@...dia.com>,
<mochs@...dia.com>, <Yunxiang.Li@....com>, <yi.l.liu@...el.com>,
<zhangdongdong@...incomputing.com>, <avihaih@...dia.com>,
<bhelgaas@...gle.com>, <peterx@...hat.com>, <pstanner@...hat.com>,
<apopple@...dia.com>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <cjia@...dia.com>, <kwankhede@...dia.com>,
<targupta@...dia.com>, <zhiw@...dia.com>, <danw@...dia.com>,
<dnigam@...dia.com>, <kjaju@...dia.com>
Subject: Re: [PATCH v1 6/6] vfio/nvgrace-gpu: vfio/nvgrace-gpu: wait for the
GPU mem to be ready
On Mon, 17 Nov 2025 12:41:59 +0000
<ankita@...dia.com> wrote:
> From: Ankit Agrawal <ankita@...dia.com>
>
> Speculative prefetches from the CPU to GPU memory while the GPU
> is not yet ready after reset can cause harmless corrected RAS events
> to be logged. The mapping is therefore expected not to be
> re-established until the GPU is ready post reset.
>
> Wait for the GPU to be ready on the first fault before establishing
> CPU mapping to the GPU memory. The GPU readiness can be checked
> through BAR0 registers as is already being done at the device probe.
>
> The state is checked on the first fault/huge_fault request using
> a flag. Unset the flag on every reset request.
>
> So intercept the following GPU reset paths and unset
> gpu_mem_mapped in each. Then use the flag to determine whether to
> wait before mapping.
> 1. VFIO_DEVICE_RESET ioctl call
> 2. FLR through config space.
If we need a stall after reset based on some device-specific readiness
criteria, shouldn't we just implement a device-specific reset? We can
create a reset callback that uses pcie_reset_flr(), then pci_iomap()s
the BAR to poll the device. See for example delay_250ms_after_flr()
and nvme_disable_and_flr(). Thanks,
Alex
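
To make the shape of that suggestion concrete, here is a rough userspace
model of "reset, then poll for readiness with a bound". It is only a
sketch and not kernel code: struct fake_gpu, FAKE_STATUS_READY,
fake_flr(), fake_read_status() and fake_reset_and_wait() are all
hypothetical names, with fake_flr() standing in for pcie_reset_flr() and
fake_read_status() standing in for a register read through the
pci_iomap()ed BAR. The real readiness registers and timeouts are not
modeled here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ready bit; the real BAR0 layout is not modeled. */
#define FAKE_STATUS_READY 0x1u

/* Hypothetical stand-in for the device, not a kernel structure. */
struct fake_gpu {
	uint32_t status;        /* models a BAR0 status register */
	int polls_until_ready;  /* reads left before "ready" is reported */
	int flr_count;          /* number of resets actually issued */
};

/* Stand-in for pcie_reset_flr(): the reset clears the ready state. */
static int fake_flr(struct fake_gpu *gpu, int settle_polls)
{
	gpu->status = 0;
	gpu->polls_until_ready = settle_polls;
	gpu->flr_count++;
	return 0;
}

/* Stand-in for reading the status register through the mapped BAR. */
static uint32_t fake_read_status(struct fake_gpu *gpu)
{
	if (gpu->polls_until_ready > 0)
		gpu->polls_until_ready--;
	if (gpu->polls_until_ready == 0)
		gpu->status = FAKE_STATUS_READY;
	return gpu->status;
}

/*
 * Device-specific reset: issue the FLR, then poll until the device
 * reports ready or max_polls reads have been made. In probe mode only
 * report that the method is usable, without touching the device.
 */
int fake_reset_and_wait(struct fake_gpu *gpu, bool probe,
			int settle_polls, int max_polls)
{
	int i, ret;

	if (probe)
		return 0;

	ret = fake_flr(gpu, settle_polls);
	if (ret)
		return ret;

	for (i = 0; i < max_polls; i++) {
		if (fake_read_status(gpu) & FAKE_STATUS_READY)
			return 0;
		/* A real callback would sleep between reads here. */
	}

	return -ETIMEDOUT;
}
```

The point of the model is that the wait lives in the reset path itself,
so every caller of the reset sees a device that is either ready or has
failed with an error, and the fault handler does not need a separate
flag.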