Message-ID: <20251126150323.3b39e1f2.alex@shazbot.org>
Date: Wed, 26 Nov 2025 15:03:23 -0700
From: Alex Williamson <alex@...zbot.org>
To: <ankita@...dia.com>
Cc: <jgg@...pe.ca>, <yishaih@...dia.com>, <skolothumtho@...dia.com>,
 <kevin.tian@...el.com>, <aniketa@...dia.com>, <vsethi@...dia.com>,
 <mochs@...dia.com>, <Yunxiang.Li@....com>, <yi.l.liu@...el.com>,
 <zhangdongdong@...incomputing.com>, <avihaih@...dia.com>,
 <bhelgaas@...gle.com>, <peterx@...hat.com>, <pstanner@...hat.com>,
 <apopple@...dia.com>, <kvm@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>, <cjia@...dia.com>, <kwankhede@...dia.com>,
 <targupta@...dia.com>, <zhiw@...dia.com>, <danw@...dia.com>,
 <dnigam@...dia.com>, <kjaju@...dia.com>
Subject: Re: [PATCH v8 6/6] vfio/nvgrace-gpu: wait for the GPU mem to be
 ready

On Wed, 26 Nov 2025 19:28:46 +0000
<ankita@...dia.com> wrote:
> +/*
> + * If the GPU memory is accessed by the CPU while the GPU is not ready
> + * after reset, it can cause harmless corrected RAS events to be logged.
> + * Make sure the GPU is ready before establishing the mappings.
> + */
> +static int
> +nvgrace_gpu_check_device_ready(struct nvgrace_gpu_pci_core_device *nvdev)
> +{
> +	struct vfio_pci_core_device *vdev = &nvdev->core_device;
> +	int ret;
> +
> +	lockdep_assert_held_read(&vdev->memory_lock);
> +
> +	if (!nvdev->reset_done)
> +		return 0;
> +
> +	ret = nvgrace_gpu_wait_device_ready(vdev->barmap[0]);
> +	if (ret)
> +		return ret;
> +
> +	nvdev->reset_done = false;
> +
> +	return 0;
> +}

It seems like we can call wait_device_ready here, generating ioread
accesses to BAR0, without knowing the memory-enable state of the device
in the command register.  Is there anything special about this device
relative to BAR0 accesses regardless of the memory-enable bit that
allows us to ignore that?

If not, do we need to test before wait_device_ready, such as:

	if (vdev->pm_runtime_engaged || !__vfio_pci_memory_enabled(vdev))
		return -EIO;
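
To illustrate the ordering only, here is a user-space model rather than
driver code.  struct mock_dev and the mock_* helpers are hypothetical
stand-ins for the vfio structures and for nvgrace_gpu_wait_device_ready(),
and the wait is assumed to succeed immediately:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state consulted by the check. */
struct mock_dev {
	bool pm_runtime_engaged;	/* models vdev->pm_runtime_engaged */
	bool memory_enabled;		/* models __vfio_pci_memory_enabled(vdev) */
	bool reset_done;		/* models nvdev->reset_done */
	int bar0_reads;			/* counts modelled BAR0 accesses */
};

/* Models the readiness wait: it touches BAR0 and is assumed to succeed. */
static int mock_wait_device_ready(struct mock_dev *d)
{
	d->bar0_reads++;
	return 0;
}

static int mock_check_device_ready(struct mock_dev *d)
{
	int ret;

	if (!d->reset_done)
		return 0;

	/* Suggested test: bail out before any BAR0 access is generated. */
	if (d->pm_runtime_engaged || !d->memory_enabled)
		return -EIO;

	ret = mock_wait_device_ready(d);
	if (ret)
		return ret;

	d->reset_done = false;

	return 0;
}
```

In this model, a device with memory disabled gets -EIO with no BAR0 read
and reset_done left set, so the wait would be retried on a later call.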

This opens up a small can of worms, though: vfio-pci allows
read/write access regardless of pm_runtime_engaged by waking the device
around such accesses.  This driver doesn't currently participate in
runtime PM beyond the vfio-pci-core code.  Do we need to add runtime PM
wrappers in its read/write handlers and a separate wrapper here that
drops the pm_runtime_engaged test?

There's a comment in the driver indicating the device is tolerant of
certain accesses, independent of the memory-enable bit, so I don't know
how much is actually required here.  Thanks,

Alex
