Message-ID: <55542653-8902-46ea-b416-776fc58bd644@oracle.com>
Date: Thu, 18 Dec 2025 15:10:44 -0800
From: jane.chu@...cle.com
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        "David Hildenbrand (Red Hat)" <david@...nel.org>,
        muchun.song@...ux.dev, osalvador@...e.de, linmiaohe@...wei.com,
        jiaqiyan@...gle.com, william.roche@...cle.com, rientjes@...gle.com,
        akpm@...ux-foundation.org, lorenzo.stoakes@...cle.com, rppt@...nel.org,
        surenb@...gle.com, mhocko@...e.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/memory-failure: fix missing ->mf_stats count in
 hugetlb poison

Hi, Liam,

Thanks!  My reply is towards the end.

On 12/18/2025 12:26 PM, Liam R. Howlett wrote:
> * jane.chu@...cle.com <jane.chu@...cle.com> [251218 14:01]:
>> Hi, David,
>>
>> Thanks for the review.
>>
>> On 12/18/2025 12:41 AM, David Hildenbrand (Red Hat) wrote:
>>> On 12/16/25 22:56, Jane Chu wrote:
>>>> When a newly poisoned subpage ends up in an already poisoned hugetlb
>>>
>>> The concept of subpages does not exist. It's a page of a hugetlb folio.
>>
>> Okay.
>>
>>>
>>>> folio, 'num_poisoned_pages' is incremented, but the per node ->mf_stats
>>>> is not. Fix the inconsistency by designating action_result() to update
>>>> them both.
>>>
>>> What is the user-visible result of that?
>>
>> For the purpose of observation, and potential action afterwards.
>>
>> # cat /proc/meminfo | grep HardwareCorrupted
>> shows 'num_poisoned_pages', the global count of poisoned pages.
>>
>> # ls /sys/devices/system/node/node0/memory_failure
>> delayed  failed  ignored  recovered  total
>> these fields show the per-node ->mf_stats, i.e. the MF handling results.
>>
>>>
>>>>
>>>> Fixes: 18f41fa616ee4 ("mm: memory-failure: bump memory failure stats to pglist_data")
>>>> Cc: <stable@...r.kernel.org>
>>>> Signed-off-by: Jane Chu <jane.chu@...cle.com>
>>>> ---
>>>>    include/linux/hugetlb.h |  4 ++--
>>>>    include/linux/mm.h      |  4 ++--
>>>>    mm/hugetlb.c            |  4 ++--
>>>>    mm/memory-failure.c     | 22 +++++++++++++---------
>>>>    4 files changed, 19 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>>>> index 8e63e46b8e1f..2e6690c9df96 100644
>>>> --- a/include/linux/hugetlb.h
>>>> +++ b/include/linux/hugetlb.h
>>>> @@ -157,7 +157,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
>>>>    bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list);
>>>>    int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison);
>>>>    int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
>>>> -                bool *migratable_cleared);
>>>> +                bool *migratable_cleared, bool *samepg);
>>>>    void folio_putback_hugetlb(struct folio *folio);
>>>>    void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
>>>>    void hugetlb_fix_reserve_counts(struct inode *inode);
>>>> @@ -420,7 +420,7 @@ static inline int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb,
>>>>    }
>>>>    static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
>>>> -                    bool *migratable_cleared)
>>>> +                    bool *migratable_cleared, bool *samepg)
>>>>    {
>>>>        return 0;
>>>>    }
>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>> index 7c79b3369b82..68b1812e9c0a 100644
>>>> --- a/include/linux/mm.h
>>>> +++ b/include/linux/mm.h
>>>> @@ -4036,7 +4036,7 @@ extern int soft_offline_page(unsigned long pfn, int flags);
>>>>    extern const struct attribute_group memory_failure_attr_group;
>>>>    extern void memory_failure_queue(unsigned long pfn, int flags);
>>>>    extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
>>>> -                    bool *migratable_cleared);
>>>> +                    bool *migratable_cleared, bool *samepg);
>>>>    void num_poisoned_pages_inc(unsigned long pfn);
>>>>    void num_poisoned_pages_sub(unsigned long pfn, long i);
>>>>    #else
>>>> @@ -4045,7 +4045,7 @@ static inline void memory_failure_queue(unsigned long pfn, int flags)
>>>>    }
>>>>    static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
>>>> -                    bool *migratable_cleared)
>>>> +                    bool *migratable_cleared, bool *samepg)
>>>>    {
>>>>        return 0;
>>>>    }
>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>>> index 0455119716ec..f78562a578e5 100644
>>>> --- a/mm/hugetlb.c
>>>> +++ b/mm/hugetlb.c
>>>> @@ -7818,12 +7818,12 @@ int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison
>>>>    }
>>>>    int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
>>>> -                bool *migratable_cleared)
>>>> +                bool *migratable_cleared, bool *samepg)
>>>>    {
>>>>        int ret;
>>>>        spin_lock_irq(&hugetlb_lock);
>>>> -    ret = __get_huge_page_for_hwpoison(pfn, flags, migratable_cleared);
>>>> +    ret = __get_huge_page_for_hwpoison(pfn, flags, migratable_cleared, samepg);
>>>>        spin_unlock_irq(&hugetlb_lock);
>>>>        return ret;
>>>>    }
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 3edebb0cda30..070f43bb110a 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -1873,7 +1873,8 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
>>>>        return count;
>>>>    }
>>>> -static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
>>>> +static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page,
>>>> +                    bool *samepg)
>>>>    {
>>>>        struct llist_head *head;
>>>>        struct raw_hwp_page *raw_hwp;
>>>> @@ -1889,17 +1890,16 @@ static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
>>>>            return -EHWPOISON;
>>>>        head = raw_hwp_list_head(folio);
>>>>        llist_for_each_entry(p, head->first, node) {
>>>> -        if (p->page == page)
>>>> +        if (p->page == page) {
>>>> +            *samepg = true;
>>>>                return -EHWPOISON;
>>>> +        }
>>>>        }
>>>>        raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
>>>>        if (raw_hwp) {
>>>>            raw_hwp->page = page;
>>>>            llist_add(&raw_hwp->node, head);
>>>> -        /* the first error event will be counted in action_result(). */
>>>> -        if (ret)
>>>> -            num_poisoned_pages_inc(page_to_pfn(page));
>>>>        } else {
>>>>            /*
>>>>             * Failed to save raw error info.  We no longer trace all
>>>> @@ -1956,7 +1956,7 @@ void folio_clear_hugetlb_hwpoison(struct folio *folio)
>>>>     *   -EHWPOISON    - the hugepage is already hwpoisoned
>>>>     */
>>>>    int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
>>>> -                 bool *migratable_cleared)
>>>> +                 bool *migratable_cleared, bool *samepg)
>>>>    {
>>>>        struct page *page = pfn_to_page(pfn);
>>>>        struct folio *folio = page_folio(page);
>>>> @@ -1981,7 +1981,7 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
>>>>                goto out;
>>>>        }
>>>> -    if (folio_set_hugetlb_hwpoison(folio, page)) {
>>>> +    if (folio_set_hugetlb_hwpoison(folio, page, samepg)) {
>>>>            ret = -EHWPOISON;
>>>>            goto out;
>>>>        }
>>>> @@ -2014,11 +2014,12 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>>>>        struct page *p = pfn_to_page(pfn);
>>>>        struct folio *folio;
>>>>        unsigned long page_flags;
>>>> +    bool samepg = false;
>>>>        bool migratable_cleared = false;
>>>>        *hugetlb = 1;
>>>>    retry:
>>>> -    res = get_huge_page_for_hwpoison(pfn, flags, &migratable_cleared);
>>>> +    res = get_huge_page_for_hwpoison(pfn, flags, &migratable_cleared, &samepg);
>>>>        if (res == 2) { /* fallback to normal page handling */
>>>>            *hugetlb = 0;
>>>>            return 0;
>>>> @@ -2027,7 +2028,10 @@ static int
>>>> try_memory_failure_hugetlb(unsigned long pfn, int flags, int
>>>> *hugetlb
>>>>                folio = page_folio(p);
>>>>                res = kill_accessing_process(current, folio_pfn(folio), flags);
>>>>            }
>>>> -        action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
>>>> +        if (samepg)
>>>> +            action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
>>>> +        else
>>>> +            action_result(pfn, MF_MSG_HUGE, MF_FAILED);
>>>
>>> Can't we somehow return that result from get_huge_page_for_hwpoison()
>>> ... folio_set_hugetlb_hwpoison() differently? E.g., return an enum
>>> instead of "-EHWPOISON" or magic value "2".
>>
>> This is an option. The existing return codes are as follows.
>> __get_huge_page_for_hwpoison():
>>   * Return values:
>>   *   0             - free hugepage
>>   *   1             - in-use hugepage
>>   *   2             - not a hugepage
>>   *   -EBUSY        - the hugepage is busy (try to retry)
>>   *   -EHWPOISON    - the hugepage is already hwpoisoned
>>
>> folio_set_hugetlb_hwpoison()
>> returns
>> 	0:		folio was not poisoned before
>> 	-EHWPOISON:	folio was poisoned before
>>
>> To get rid of 'samepg', how about
>>
>> __get_huge_page_for_hwpoison():
>>   * Return values:
>>   *   0             - free hugepage
>>   *   1             - in-use hugepage
>>   *   2             - not a hugepage
>>   *   3             - the hugepage is already hwpoisoned in a different page
>>   *   4             - the hugepage is already hwpoisoned in the same page
>>   *   -EBUSY        - the hugepage is busy (try to retry)
>>
>> folio_set_hugetlb_hwpoison()
>> returns
>> 	0:		folio was not poisoned before
>> 	1:	        folio was poisoned before in different page
>> 	2:	        folio was poisoned before in the same page
>>
>> The whole point of identifying the same page is so that the re-poison
>> event is not double counted.
> 
> This means folio_set_hugetlb_hwpoison() returns 0 on success but
> positive values on error... this seems to be going further away from the
> standard way of doing things?

Yes.

> It would actually be good to remove all magic values instead of
> expanding them.
> 
> I think what David was trying to say is to have a local enum that states
> what these numbers mean so that the code reads more cleanly, instead of
> digging for the right comment to decode it.
> 
> For example, in try_memory_failure_hugetlb():
> 
> if (res == 2) { /* fallback to normal page handling */
> 
> vs:
> 
> if (res == MEMORY_FAILURE_NOT_HUGEPAGE) { /* fallback to normal page handling */
> 
> You could spell out your other options as well.  Maybe something like
> MEMORY_FAILURE_HWPOISONED_ALREADY_COUNTED
> MEMORY_FAILURE_HWPOISONED
> 
> This would avoid adding more magic values and increase readability.
> 
> If you changed try_memory_failure_hugetlb() to use a switch statement,
> then the compiler can catch unchecked enums for us too.
> 
> If you don't want to go the enum route, then you could use a different
> error code and propagate it through, like -EEXIST for the new case?
> That way the return is still 0 on success and less than 0 on failure,
> but I think the enum idea has a number of advantages.

I am open; I actually prefer the enum with a switch statement, as you
suggested above.
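
Something along these lines, perhaps; just a sketch, and the enum/member
names below are placeholders rather than a naming proposal:

enum hwpoison_hugetlb_result {
	MF_HUGETLB_FREE,		/* today: 0, free hugepage */
	MF_HUGETLB_IN_USE,		/* today: 1, in-use hugepage */
	MF_HUGETLB_NOT_HUGEPAGE,	/* today: 2, not a hugepage */
	MF_HUGETLB_BUSY,		/* today: -EBUSY, retry */
	MF_HUGETLB_POISONED,		/* already poisoned, new raw page */
	MF_HUGETLB_POISONED_SAME_PAGE,	/* already poisoned, same raw page */
};

and then in try_memory_failure_hugetlb(), roughly (simplified; the
MF_ACTION_REQUIRED/kill_accessing_process handling and any retry limiting
are left out):

	switch (res) {
	case MF_HUGETLB_NOT_HUGEPAGE:
		*hugetlb = 0;	/* fall back to normal page handling */
		return 0;
	case MF_HUGETLB_POISONED:
		/* new raw page in an already poisoned folio: account it */
		action_result(pfn, MF_MSG_HUGE, MF_FAILED);
		return -EHWPOISON;
	case MF_HUGETLB_POISONED_SAME_PAGE:
		/* same raw page reported again: don't double count */
		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
		return -EHWPOISON;
	case MF_HUGETLB_BUSY:
		goto retry;
	case MF_HUGETLB_FREE:
	case MF_HUGETLB_IN_USE:
		break;		/* proceed with hugetlb handling below */
	}

With no default case, the compiler can then warn about unhandled enum
values, as you noted.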

What about folio_set_hugetlb_hwpoison()?
Conventionally, folio_set_X_Y() returns only two possible values, but here
we need three.  How about renaming the function to set_hugetlb_hwpoison()
to signal the deviation from that convention?  After all, the function does
more than conventional bit setting: it also maintains a per-folio raw-error
linked list to track the poisoned pages within the folio.
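
To make that concrete, a possible shape; again only a sketch, and the
names are not meant to be final:

enum hugetlb_hwpoison {
	HUGETLB_HWP_NEW,	/* folio was not poisoned before */
	HUGETLB_HWP_KNOWN,	/* poisoned before, in a different raw page */
	HUGETLB_HWP_SAME_PAGE,	/* poisoned before, in the same raw page */
};

static enum hugetlb_hwpoison set_hugetlb_hwpoison(struct folio *folio,
						  struct page *page);

__get_huge_page_for_hwpoison() would then translate that into its own
return value, and the 'samepg' output parameter goes away.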

Thanks!
-jane

> 
> Thanks,
> Liam
> 

