Message-ID: <20260119133805.49fa7b8d@shazbot.org>
Date: Mon, 19 Jan 2026 13:38:05 -0700
From: Alex Williamson <alex@...zbot.org>
To: <ankita@...dia.com>
Cc: <vsethi@...dia.com>, <jgg@...dia.com>, <mochs@...dia.com>,
<jgg@...pe.ca>, <skolothumtho@...dia.com>, <linmiaohe@...wei.com>,
<nao.horiguchi@...il.com>, <cjia@...dia.com>, <zhiw@...dia.com>,
<kjaju@...dia.com>, <yishaih@...dia.com>, <kevin.tian@...el.com>,
<kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>
Subject: Re: [PATCH v2 0/2] Register device memory for poison handling
On Thu, 15 Jan 2026 20:28:47 +0000
<ankita@...dia.com> wrote:
> From: Ankit Agrawal <ankita@...dia.com>
>
> Linux MM provides interfaces to allow a driver to [un]register device
> memory not backed by struct page for poison handling through
> memory_failure.
>
> The device memory on NVIDIA Grace based systems is not added to the
> kernel and is not backed by struct pages. The nvgrace-gpu module,
> which manages the device memory, can therefore use these interfaces to
> get the benefit of poison handling. Make nvgrace-gpu register the device
> memory with the MM on open.
>
> Moreover, stubs are added to accommodate CONFIG_MEMORY_FAILURE being
> disabled.
>
> Patch 1/2 introduces stubs for when CONFIG_MEMORY_FAILURE is disabled.
> Patch 2/2 registers the device memory at the time of open instead of mmap.
>
> Note that this is a repost of an earlier series [1], part of which
> (patch 1/3) was merged in v6.19-rc4. This series covers the remaining patches.
> Many thanks to Jason Gunthorpe (jgg@...dia.com) and Alex Williamson
> (alex@...zbot.org) for valuable suggestions.
>
> Link: https://lore.kernel.org/all/20251213044708.3610-1-ankita@nvidia.com/ [1]
>
> Changelog:
> v2:
> - Fixed nit to clean up nvgrace_gpu_vfio_pci_register_pfn_range
> (Thanks Jiaqi Yan)
> Link: https://lore.kernel.org/all/20260108153548.7386-1-ankita@nvidia.com/ [v1]
>
> Ankit Agrawal (2):
> mm: add stubs for PFNMAP memory failure registration functions
> vfio/nvgrace-gpu: register device memory for poison handling
>
> drivers/vfio/pci/nvgrace-gpu/main.c | 113 +++++++++++++++++++++++++++-
> include/linux/memory-failure.h | 13 +++-
> 2 files changed, 120 insertions(+), 6 deletions(-)
>
Applied to vfio next branch for v6.20/7.0. Thanks,
Alex