Message-ID: <CACw3F51qrBXnN370Btk7=bcKU7s44nmQYfN=EAfq25MondRUNA@mail.gmail.com>
Date: Tue, 20 Jan 2026 08:28:38 -0800
From: Jiaqi Yan <jiaqiyan@...gle.com>
To: Ankit Agrawal <ankita@...dia.com>
Cc: Aniket Agashe <aniketa@...dia.com>, Vikram Sethi <vsethi@...dia.com>,
Jason Gunthorpe <jgg@...dia.com>, Matt Ochs <mochs@...dia.com>,
Shameer Kolothum <skolothumtho@...dia.com>, "linmiaohe@...wei.com" <linmiaohe@...wei.com>,
"nao.horiguchi@...il.com" <nao.horiguchi@...il.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "david@...hat.com" <david@...hat.com>,
"lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>,
"Liam.Howlett@...cle.com" <Liam.Howlett@...cle.com>, "vbabka@...e.cz" <vbabka@...e.cz>,
"rppt@...nel.org" <rppt@...nel.org>, "surenb@...gle.com" <surenb@...gle.com>, "mhocko@...e.com" <mhocko@...e.com>,
"tony.luck@...el.com" <tony.luck@...el.com>, "bp@...en8.de" <bp@...en8.de>,
"rafael@...nel.org" <rafael@...nel.org>, "guohanjun@...wei.com" <guohanjun@...wei.com>,
"mchehab@...nel.org" <mchehab@...nel.org>, "lenb@...nel.org" <lenb@...nel.org>,
"kevin.tian@...el.com" <kevin.tian@...el.com>, "alex@...zbot.org" <alex@...zbot.org>, Neo Jia <cjia@...dia.com>,
Kirti Wankhede <kwankhede@...dia.com>, "Tarun Gupta (SW-GPU)" <targupta@...dia.com>, Zhi Wang <zhiw@...dia.com>,
Dheeraj Nigam <dnigam@...dia.com>, Krishnakant Jaju <kjaju@...dia.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"Jonathan.Cameron@...wei.com" <Jonathan.Cameron@...wei.com>, "ira.weiny@...el.com" <ira.weiny@...el.com>,
"Smita.KoralahalliChannabasappa@....com" <Smita.KoralahalliChannabasappa@....com>,
"u.kleine-koenig@...libre.com" <u.kleine-koenig@...libre.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Axel Rasmussen <axelrasmussen@...gle.com>
Subject: Re: [PATCH v5 0/3] mm: Implement ECC handling for pfn with no struct page
On Fri, Jan 16, 2026 at 9:36 PM Ankit Agrawal <ankita@...dia.com> wrote:
>
> >>
> >> v2 -> v3
> >> - Rebased to v6.17-rc7.
> >> - Skipped the unmapping of PFNMAP during reception of poison. Suggested by
> >> Jason Gunthorpe, Jiaqi Yan, Vikram Sethi (Thanks!)
> >> - Updated the check to prevent multiple registration to the same PFN
> >> range using interval_tree_iter_first. Thanks Shameer Kolothum for the
> >> suggestion.
> >> - Removed the callback function in the nvgrace-gpu requiring tracking of
> >> poisoned PFN as it isn't required anymore.
> >
> > Hi Ankit,
> >
> >
> > I get that for the nvgrace-gpu driver, you removed pfn_address_space_ops
> > because there is no need to unmap poisoned HBM pages.
> >
> > What about the nvgrace-egm driver? Now that you removed the
> > pfn_address_space_ops callback from pfn_address_space in [1], how can
> > the nvgrace-egm driver learn about poisoned EGM pages at runtime?
> >
> > I expect the functionality to return retired pages should also include
> > runtime poisoned pages, which are not in the list queried from
> > egm-retired-pages-data-base during initialization. Or maybe my
> > expectation is wrong/obsolete?
>
> Hi Jiaqi, yes, the EGM code will account for runtime poisoned pages
> as well. It will instead make use of the pfn_to_vma_pgoff callback
> merged through https://github.com/torvalds/linux/commit/e6dbcb7c0e7b508d443a9aa6f77f63a2f83b1ae4
Thank you! Sorry I wasn't following that thread closely and missed it.
>
> > [1] https://lore.kernel.org/linux-mm/20230920140210.12663-2-ankita@nvidia.com
> > [2] https://lore.kernel.org/kvm/20250904040828.319452-12-ankita@nvidia.com
>