Message-ID: <5046fe72-4e1c-4ed9-a970-af4b28e54ba8@oracle.com>
Date: Mon, 22 Dec 2025 12:29:44 -0800
From: jane.chu@...cle.com
To: Miaohe Lin <linmiaohe@...wei.com>
Cc: muchun.song@...ux.dev, osalvador@...e.de, david@...nel.org,
jiaqiyan@...gle.com, william.roche@...cle.com, rientjes@...gle.com,
akpm@...ux-foundation.org, lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com, rppt@...nel.org, surenb@...gle.com,
mhocko@...e.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept
hugetlb tail page pfn
On 12/21/2025 7:01 PM, Miaohe Lin wrote:
> On 2025/12/19 16:06, jane.chu@...cle.com wrote:
>>
>>
>> On 12/19/2025 12:01 AM, Miaohe Lin wrote:
>>> On 2025/12/19 14:28, Jane Chu wrote:
>>>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>>>> passes the head pfn to kill_accessing_process(), which is not right.
>>>> The precise pfn of the poisoned page should be used in order to
>>>> determine the precise vaddr as the SIGBUS payload.
>>>>
>>>> This issue has already been taken care of in the normal path, that is,
>>>> hwpoison_user_mappings(), see [1][2]. Furthermore, for [3] to work
>>>> correctly in the hugetlb repoisoning case, it's essential to inform
>>>> the VM of the precise poisoned page, not the head page.
>>>>
>>>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>>>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>>>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>>>
>>>
>>> Thanks for your patch.
>>>
>>>> Cc: <stable@...r.kernel.org>
>>>> Signed-off-by: Jane Chu <jane.chu@...cle.com>
>>>> ---
>>>> mm/memory-failure.c | 22 ++++++++++++----------
>>>> 1 file changed, 12 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 3edebb0cda30..c9d87811b1ea 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>>> }
>>>> static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>> - unsigned long poisoned_pfn, struct to_kill *tk)
>>>> + unsigned long poisoned_pfn, struct to_kill *tk,
>>>> + int pte_nr)
>>>> {
>>>> unsigned long pfn = 0;
>>>> + unsigned long hwpoison_vaddr;
>>>> if (pte_present(pte)) {
>>>> pfn = pte_pfn(pte);
>>>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>> pfn = swp_offset_pfn(swp);
>>>> }
>>>> - if (!pfn || pfn != poisoned_pfn)
>>>> + if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>>> return 0;
>>>
>>> Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?
>>
>> Why? Is there any concern with using the macro pages_per_huge_page(h) ?
>
> No, I was trying to get rid of new @pte_nr parameter. Something like below:
>
> static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> - unsigned long poisoned_pfn, struct to_kill *tk,
> - int pte_nr)
> + unsigned long poisoned_pfn, struct to_kill *tk)
> {
> unsigned long pfn = 0;
> unsigned long hwpoison_vaddr;
> + int pte_nr;
>
> if (pte_present(pte)) {
> pfn = pte_pfn(pte);
> @@ -701,7 +701,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> pfn = softleaf_to_pfn(entry);
> }
>
> - if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
> + pte_nr = 1UL << (shift - PAGE_SHIFT);
> + if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
> return 0;
>
> hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
>
> So we don't have to pass in pte_nr from all callers. But that's trivial.
Got it, that's better. I will combine your and Matthew's suggestions in v3.
Thanks a lot!
-jane
>
> Thanks.
> .
>
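
For readers following the thread, the range check and address computation
discussed above can be sketched in userspace C. This is only an illustration,
not the kernel code: sketch_poisoned_vaddr() and SKETCH_PAGE_SHIFT are
hypothetical names, a 4 KiB base page is assumed, and the pte/swap-entry
decoding that check_hwpoisoned_entry() performs is replaced by a plain pfn
argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB base pages. The kernel takes PAGE_SHIFT from the arch. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for the check in check_hwpoisoned_entry().
 *
 * pfn          first pfn covered by the mapping entry (0 means "none")
 * addr         virtual address at which that entry is mapped
 * shift        mapping size as a shift (12 for 4 KiB, 21 for 2 MiB, ...)
 * poisoned_pfn pfn of the poisoned page, possibly a hugetlb tail page
 *
 * Returns true and stores the poisoned page's virtual address in *vaddr
 * when poisoned_pfn lies within [pfn, pfn + pte_nr - 1].
 */
static bool sketch_poisoned_vaddr(unsigned long pfn, unsigned long addr,
                                  short shift, unsigned long poisoned_pfn,
                                  unsigned long *vaddr)
{
    unsigned long pte_nr;

    assert(shift >= SKETCH_PAGE_SHIFT);

    /* Number of base pages covered by the entry, derived from shift. */
    pte_nr = 1UL << (shift - SKETCH_PAGE_SHIFT);

    if (!pfn || pfn > poisoned_pfn || pfn + pte_nr - 1 < poisoned_pfn)
        return false;

    /* Offset of the poisoned page from the start of the mapping. */
    *vaddr = addr + ((poisoned_pfn - pfn) << SKETCH_PAGE_SHIFT);
    return true;
}
```

With shift = 21 (a 2 MiB mapping under the 4 KiB assumption) pte_nr is 512,
so a tail page three pages past the head resolves to addr + 0x3000 instead of
being rejected by an exact pfn comparison.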