Message-ID: <20260108153548.7386-1-ankita@nvidia.com>
Date: Thu, 8 Jan 2026 15:35:46 +0000
From: <ankita@...dia.com>
To: <ankita@...dia.com>, <vsethi@...dia.com>, <jgg@...dia.com>,
<mochs@...dia.com>, <jgg@...pe.ca>, <skolothumtho@...dia.com>,
<alex@...zbot.org>, <linmiaohe@...wei.com>, <nao.horiguchi@...il.com>
CC: <cjia@...dia.com>, <zhiw@...dia.com>, <kjaju@...dia.com>,
<yishaih@...dia.com>, <kevin.tian@...el.com>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>
Subject: [PATCH v1 0/2] Register device memory for poison handling
From: Ankit Agrawal <ankita@...dia.com>
Linux MM provides interfaces to allow a driver to [un]register device
memory not backed by struct page for poison handling through
memory_failure.
The device memory on NVIDIA Grace based systems is not added to the
kernel and is not backed by struct pages. The nvgrace-gpu module,
which manages the device memory, can therefore use these interfaces to
get the benefit of poison handling. Make nvgrace-gpu register the device
memory with the MM on open.
In addition, add stubs to cover the case where CONFIG_MEMORY_FAILURE
is disabled.
Patch 1/2 introduces stubs for CONFIG_MEMORY_FAILURE disabled.
Patch 2/2 registers the device memory at the time of open instead of mmap.
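For readers unfamiliar with the stub pattern mentioned for patch 1/2, here is
a minimal userspace sketch. Every demo_* name and the DEMO_MEMORY_FAILURE
macro are hypothetical stand-ins and are not the kernel's API; returning
-EOPNOTSUPP from the stub and treating it as non-fatal in the open path are
assumptions made for illustration, not a description of the actual patches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a PFN range registered for poison handling.
 * Build with -DDEMO_MEMORY_FAILURE to mimic CONFIG_MEMORY_FAILURE=y.
 */
struct demo_pfn_space {
	unsigned long start_pfn;
	unsigned long last_pfn;
	int registered;
};

#ifdef DEMO_MEMORY_FAILURE
/* "Real" implementation: validate the range and mark it registered. */
static inline int demo_register_pfn_space(struct demo_pfn_space *s)
{
	if (!s || s->start_pfn > s->last_pfn)
		return -EINVAL;
	s->registered = 1;
	return 0;
}

static inline void demo_unregister_pfn_space(struct demo_pfn_space *s)
{
	if (s)
		s->registered = 0;
}
#else
/* Stubs: callers compile unchanged when the feature is configured out. */
static inline int demo_register_pfn_space(struct demo_pfn_space *s)
{
	(void)s;
	return -EOPNOTSUPP;	/* assumed error code for this sketch */
}

static inline void demo_unregister_pfn_space(struct demo_pfn_space *s)
{
	(void)s;
}
#endif

/* Open path: register once; "not supported" is treated as non-fatal here. */
static int demo_device_open(struct demo_pfn_space *s)
{
	int ret = demo_register_pfn_space(s);

	if (ret && ret != -EOPNOTSUPP)
		return ret;
	return 0;
}

/* Close path: undo the registration; s must not be NULL. */
static void demo_device_close(struct demo_pfn_space *s)
{
	demo_unregister_pfn_space(s);
	assert(!s->registered);
}
```

With stubs in the header, the driver-side open and close paths need no #ifdef
of their own: the same calls compile whether or not the feature is built in.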
Note that this is a repost of an earlier series [1], of which patch 1/3
has been merged in v6.19-rc4. This series carries the remaining patches.
Many thanks to Jason Gunthorpe (jgg@...dia.com) and Alex Williamson
(alex@...zbot.org) for valuable suggestions.
Link: https://lore.kernel.org/all/20251213044708.3610-1-ankita@nvidia.com/ [1]
Ankit Agrawal (2):
  mm: add stubs for PFNMAP memory failure registration functions
  vfio/nvgrace-gpu: register device memory for poison handling

 drivers/vfio/pci/nvgrace-gpu/main.c | 116 +++++++++++++++++++++++++++-
 include/linux/memory-failure.h      |  13 +++-
 2 files changed, 123 insertions(+), 6 deletions(-)
--
2.34.1