Message-ID: <20260114144824.69960-1-boudewijn@delta-utec.com>
Date: Wed, 14 Jan 2026 15:48:23 +0100
From: Boudewijn van der Heide <boudewijn@...ta-utec.com>
To: ziy@...dia.com
Cc: akpm@...ux-foundation.org,
boudewijn@...ta-utec.com,
hannes@...xchg.org,
jackmanb@...gle.com,
linmiaohe@...wei.com,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
mhocko@...e.com,
nao.horiguchi@...il.com,
osalvador@...e.de,
surenb@...gle.com,
vbabka@...e.cz
Subject: Re: [PATCH] mm/page_alloc: Fix freeing of failed-split poisoned compound pages
> > free_pages_prepare() only handles poisoned order-0 pages.
> > In memory_failure() (hard offline), pages
> > are poisoned before attempting to split huge pages. If the split fails,
> > the page remains a compound (order > 0) but is already poisoned.
> > Soft-offline pages, by contrast, are always poisoned as order-0 after
> > migration, so they are unaffected.
> >
> > The '!order' check causes these poisoned compound pages to skip
> > poison handling, leaving them in the buddy allocator.
> >
> > Worst case, a poisoned compound page could be reallocated,
> > potentially leading to crashes, silent data corruption,
> > or unwanted memory containment actions before the poison bit is detected.
> >
> > This patch removes the '&& !order' restriction. Cleanup functions in the
> > poison-handling block correctly handle non-zero order pages, making
> > this change safe.
> This is not a fix. IIUC, for >0 order free pages, memory failure uses
> take_page_off_buddy() in a different code path.
>
Thanks again for the quick response and clarification!

As I understand it, you are right that take_page_off_buddy() handles
already-free pages: it removes them from the buddy lists and marks them
with SetPageHWPoisonTakenOff(), which prevents those pages from
re-entering the buddy allocator.

My concern is about in-use THP-backed compound pages:

1. A compound page is in use.
2. memory_failure() marks it poisoned (TestSetPageHWPoison).
3. try_to_split_thp_page() fails.
4. The process using the THP may be killed;
   the page remains compound and poisoned.
5. Later, when the page is finally freed, it reaches free_pages_prepare();
   'take_page_off_buddy()' is not invoked in this path.

At this point, the current check:

    'if (unlikely(PageHWPoison(page)) && !order)'

will not trigger, because the order > 0.

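
To make the condition concrete, here is a minimal userspace sketch of the
check in isolation. It is not kernel code: 'struct toy_page' and both
helper functions are hypothetical stand-ins invented for illustration,
and it models only the branch condition, not what the poison-handling
block does once entered. Whether entering that block is safe for
order > 0 pages is exactly the open question in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; NOT the kernel's definition. */
struct toy_page {
	bool hwpoison;		/* models PageHWPoison(page) */
	unsigned int order;	/* order the page is being freed with */
};

/*
 * Models the current condition in free_pages_prepare():
 *   if (unlikely(PageHWPoison(page)) && !order)
 */
static bool poison_path_current(const struct toy_page *page)
{
	return page->hwpoison && page->order == 0;
}

/* Models the same condition with '&& !order' dropped, as proposed. */
static bool poison_path_proposed(const struct toy_page *page)
{
	return page->hwpoison;
}
```

For a poisoned page freed with a non-zero order (for example order 9,
chosen here only as an illustrative value), poison_path_current()
returns false while poison_path_proposed() returns true. For a poisoned
order-0 page both return true.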
> Miaohe (cc’d) should be able to elaborate more on it.
Thanks for Cc'ing Miaohe. Hopefully Miaohe can provide some more insight!

Thanks,
Boudewijn