[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <350b69d3-0680-41c9-8b3e-578504894372@lucifer.local>
Date: Thu, 16 Oct 2025 09:34:16 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Zi Yan <ziy@...dia.com>
Cc: linmiaohe@...wei.com, david@...hat.com, jane.chu@...cle.com,
kernel@...kajraghav.com,
syzbot+e6367ea2fdab6ed46056@...kaller.appspotmail.com,
syzkaller-bugs@...glegroups.com, akpm@...ux-foundation.org,
mcgrof@...nel.org, nao.horiguchi@...il.com,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
Lance Yang <lance.yang@...ux.dev>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH 1/2] mm/huge_memory: do not change split_huge_page*()
target order silently.
On Wed, Oct 15, 2025 at 06:57:37PM -0400, Zi Yan wrote:
> On 15 Oct 2025, at 10:25, Lorenzo Stoakes wrote:
>
> > On Fri, Oct 10, 2025 at 01:39:05PM -0400, Zi Yan wrote:
> >> Page cache folios from a file system that support large block size (LBS)
> >> can have minimal folio order greater than 0, thus a high order folio might
> >> not be able to be split down to order-0. Commit e220917fa507 ("mm: split a
> >> folio in minimum folio order chunks") bumps the target order of
> >> split_huge_page*() to the minimum allowed order when splitting a LBS folio.
> >> This causes confusion for some split_huge_page*() callers like memory
> >> failure handling code, since they expect after-split folios all have
> >> order-0 when split succeeds but in really get min_order_for_split() order
> >> folios.
> >>
> >> Fix it by failing a split if the folio cannot be split to the target order.
> >>
> >> Fixes: e220917fa507 ("mm: split a folio in minimum folio order chunks")
> >> [The test poisons LBS folios, which cannot be split to order-0 folios, and
> >> also tries to poison all memory. The non split LBS folios take more memory
> >> than the test anticipated, leading to OOM. The patch fixed the kernel
> >> warning and the test needs some change to avoid OOM.]
> >> Reported-by: syzbot+e6367ea2fdab6ed46056@...kaller.appspotmail.com
> >> Closes: https://lore.kernel.org/all/68d2c943.a70a0220.1b52b.02b3.GAE@google.com/
> >> Signed-off-by: Zi Yan <ziy@...dia.com>
> >
> > Generally ok with the patch in general but a bunch of comments below!
> >
> >> ---
> >> include/linux/huge_mm.h | 28 +++++-----------------------
> >> mm/huge_memory.c | 9 +--------
> >> mm/truncate.c | 6 ++++--
> >> 3 files changed, 10 insertions(+), 33 deletions(-)
> >>
> >> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >> index 8eec7a2a977b..9950cda1526a 100644
> >> --- a/include/linux/huge_mm.h
> >> +++ b/include/linux/huge_mm.h
> >> @@ -394,34 +394,16 @@ static inline int split_huge_page_to_list_to_order(struct page *page, struct lis
> >> * Return: 0: split is successful, otherwise split failed.
> >> */
> >
> > You need to update the kdoc too.
>
> Done it locally.
Thanks!
>
> >
> > Also can you mention there this is the function you should use if you want
> > to specify an order?
>
> You mean min_order_for_split()? Sure.
No, I mean try_folio_split_to_order() :)
But ofc this applies to min_order_for_split() also
>
> >
> > Maybe we should rename this function to try_folio_split_to_order() to make
> > that completely explicit now that we're making other splitting logic always
> > split to order-0?
>
> Sure.
Thanks
> >
> >> static inline int try_folio_split(struct folio *folio, struct page *page,
> >> - struct list_head *list)
> >> + struct list_head *list, unsigned int order)
> >
> > Is this target order? I see non_uniform_split_supported() calls this
> > new_order so maybe let's use the same naming so as not to confuse it with
> > the current folio order?
>
> Sure, will rename it to new_order.
Thanks
>
> >
> > Also - nitty one, but should we put the order as 3rd arg rather than 4th?
> >
> > As it seems it's normal to pass NULL list, and it's a bit weird to see a
> > NULL in the middle of the args.
>
> OK, will reorder the args.
Thanks
>
> >
> >> {
> >> - int ret = min_order_for_split(folio);
> >> -
> >> - if (ret < 0)
> >> - return ret;
> >
> > OK so the point of removing this is that we assume in truncate (the only
> > user) that we already have this information (i.e. from
> > mapping_min_folio_order()) right?
>
> Right.
>
> >
> >> -
> >> - if (!non_uniform_split_supported(folio, 0, false))
> >> + if (!non_uniform_split_supported(folio, order, false))
> >
> > While we're here can we make the mystery meat last param commented like:
> >
> > if (!non_uniform_split_supported(folio, order, /* warns= */false))
>
> Sure.
Thanks
>
> >
> >> return split_huge_page_to_list_to_order(&folio->page, list,
> >> - ret);
> >> - return folio_split(folio, ret, page, list);
> >> + order);
> >> + return folio_split(folio, order, page, list);
> >> }
> >> static inline int split_huge_page(struct page *page)
> >> {
> >> - struct folio *folio = page_folio(page);
> >> - int ret = min_order_for_split(folio);
> >> -
> >> - if (ret < 0)
> >> - return ret;
> >> -
> >> - /*
> >> - * split_huge_page() locks the page before splitting and
> >> - * expects the same page that has been split to be locked when
> >> - * returned. split_folio(page_folio(page)) cannot be used here
> >> - * because it converts the page to folio and passes the head
> >> - * page to be split.
> >> - */
> >> - return split_huge_page_to_list_to_order(page, NULL, ret);
> >> + return split_huge_page_to_list_to_order(page, NULL, 0);
> >
> > OK so the idea here is that callers would expect to split to 0 and the
> > specific instance where we would actually want this behaviour of splittnig
> > to a minimum order is now limited only to try_folio_split() (or
> > try_folio_split_to_order() if you rename)?
> >
>
> Before commit e220917fa507 (the one to be fixed), split_huge_page() always
> splits @page to order 0. It is just restoring the original behavior.
> If caller wants to split a different order, they should use
> split_huge_page_to_list_to_order() (current no such user except debugfs test
> code).
Yeah makes sense, though now they can also use try_folio_split_to_order() of
course!
>
> >> }
> >> void deferred_split_folio(struct folio *folio, bool partially_mapped);
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 0fb4af604657..af06ee6d2206 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -3829,8 +3829,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> >>
> >> min_order = mapping_min_folio_order(folio->mapping);
> >> if (new_order < min_order) {
> >> - VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u",
> >> - min_order);
> >
> > Why are we dropping this?
>
> This is used to catch “misuse” of split_huge_page_to_list_to_order(), when
> caller wants to split a LBS folio to an order smaller than
> mapping_min_folio_order(). It is based on the assumption that split code
> should never fail on a LBS folio. But that assumption is causing problems
> like the reported memory failure one. So it is removed to allow split code
> to fail without a warning if a LBS folio cannot be split to the new_order.
OK fair, we shouldn't be warning if this is something that can actually
reasonably happen.
Cheers, Lorenzo
Powered by blists - more mailing lists