[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
<SA1PR12MB71992D35824674EB3A0C7D4AB0F1A@SA1PR12MB7199.namprd12.prod.outlook.com>
Date: Fri, 24 Oct 2025 11:59:09 +0000
From: Ankit Agrawal <ankita@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>, Shuai Xue <xueshuai@...ux.alibaba.com>
CC: Aniket Agashe <aniketa@...dia.com>, Vikram Sethi <vsethi@...dia.com>, Matt
Ochs <mochs@...dia.com>, Shameer Kolothum <skolothumtho@...dia.com>,
"linmiaohe@...wei.com" <linmiaohe@...wei.com>, "nao.horiguchi@...il.com"
<nao.horiguchi@...il.com>, "akpm@...ux-foundation.org"
<akpm@...ux-foundation.org>, "david@...hat.com" <david@...hat.com>,
"lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>,
"Liam.Howlett@...cle.com" <Liam.Howlett@...cle.com>, "vbabka@...e.cz"
<vbabka@...e.cz>, "rppt@...nel.org" <rppt@...nel.org>, "surenb@...gle.com"
<surenb@...gle.com>, "mhocko@...e.com" <mhocko@...e.com>,
"tony.luck@...el.com" <tony.luck@...el.com>, "bp@...en8.de" <bp@...en8.de>,
"rafael@...nel.org" <rafael@...nel.org>, "guohanjun@...wei.com"
<guohanjun@...wei.com>, "mchehab@...nel.org" <mchehab@...nel.org>,
"lenb@...nel.org" <lenb@...nel.org>, "kevin.tian@...el.com"
<kevin.tian@...el.com>, "alex@...zbot.org" <alex@...zbot.org>, Neo Jia
<cjia@...dia.com>, Kirti Wankhede <kwankhede@...dia.com>, "Tarun Gupta
(SW-GPU)" <targupta@...dia.com>, Zhi Wang <zhiw@...dia.com>, Dheeraj Nigam
<dnigam@...dia.com>, Krishnakant Jaju <kjaju@...dia.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>, "linux-edac@...r.kernel.org"
<linux-edac@...r.kernel.org>, "Jonathan.Cameron@...wei.com"
<Jonathan.Cameron@...wei.com>, "ira.weiny@...el.com" <ira.weiny@...el.com>,
"Smita.KoralahalliChannabasappa@....com"
<Smita.KoralahalliChannabasappa@....com>, "u.kleine-koenig@...libre.com"
<u.kleine-koenig@...libre.com>, "peterz@...radead.org"
<peterz@...radead.org>, "linux-acpi@...r.kernel.org"
<linux-acpi@...r.kernel.org>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH v3 1/3] mm: handle poisoning of pfn without struct pages
Thank you so much for the feedbacks! Comments inline.
>> +int register_pfn_address_space(struct pfn_address_space *pfn_space)
>> +{
>> + if (!pfn_space)
>> + return -EINVAL;
>
> Is there a reason callers may be passing NULL?
No, that would be an invalid use case for such.
> Register and unregister are good places to use guard().
Thanks for the suggestion Ira. Will update it.
>If the pfn is not in the process why is it added to the kill list?
I kept it as it is still a process with VMA mapped to the problematic
PFN. This would ultimately result in SIGKILL to be sent to the process.
in kill_procs(). This is very similar to the __add_to_kill() implementation.
>> +static int memory_failure_pfn(unsigned long pfn, int flags)
>> +{
>> + struct interval_tree_node *node;
>> + LIST_HEAD(tokill);
>> +
>> + mutex_lock(&pfn_space_lock);
>
> scoped_guard() Or probably wrap this part in a guarded function.
Ack, thanks.
>> +void unregister_pfn_address_space(struct pfn_address_space *pfn_space)
>> +{
>> + if (!pfn_space)
>> + return;
>> +
>> + mutex_lock(&pfn_space_lock);
>> + interval_tree_remove(&pfn_space->node, &pfn_space_itree);
>
> IIRC removing something not in interval tree will panic kernel. If I
> am not mistaken, should here do something like
> interval_tree_iter_first before interval_tree_remove, to avoid
> driver's ill behavior crash the system?
Thanks Jiaqi for the suggestion. Yeah, I think we should add it.
I'll fix that in the next version.
> If pfn doesn't belong to any address space mapping, it's still
> counted as MF_RECOVERED?
Hi Miaohe, I wasn't sure how should we tag this. It seems you
are suggesting MF_FAILED is more appropriate. I'll address that.
>> p = pfn_to_online_page(pfn);
>> if (!p) {
>> res = arch_memory_failure(pfn, flags);
>
> Can we move above memory_failure_pfn block here? I'm worried
> that too many scenario branches might lead to confusion.
Sure if there isn't any objection, I'll update it.
>> On Fri, Oct 24, 2025 at 05:45:45PM +0800, Shuai Xue wrote:
>> Rather than having MM maintain metadata about these PFNs, have you
>> considered adding an operation callback similar to
>> dev_pagemap_ops->memory_failure?
>
> I think someone could come with such a proposal on top of this, it
> would not be hard to add some ops to pfn_address_space and have the
> code call them instead of using the address_space.
> This version just needs to link into the existing VMA machinery (ie
> collect_procs_pfn), it doesn't make alot of sense to push that work
> into drivers.
Thanks Shuai for the suggestion. However, I agree on this with Jason.
It is preferable to keep that as separate.
Powered by blists - more mailing lists