linux-kernel - Re: [PATCH 1/2] mm/huge_memory: do not change split_huge

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <350b69d3-0680-41c9-8b3e-578504894372@lucifer.local>
Date: Thu, 16 Oct 2025 09:34:16 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Zi Yan <ziy@...dia.com>
Cc: linmiaohe@...wei.com, david@...hat.com, jane.chu@...cle.com,
        kernel@...kajraghav.com,
        syzbot+e6367ea2fdab6ed46056@...kaller.appspotmail.com,
        syzkaller-bugs@...glegroups.com, akpm@...ux-foundation.org,
        mcgrof@...nel.org, nao.horiguchi@...il.com,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
        Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
        Lance Yang <lance.yang@...ux.dev>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH 1/2] mm/huge_memory: do not change split_huge_page*()
 target order silently.

On Wed, Oct 15, 2025 at 06:57:37PM -0400, Zi Yan wrote:
> On 15 Oct 2025, at 10:25, Lorenzo Stoakes wrote:
>
> > On Fri, Oct 10, 2025 at 01:39:05PM -0400, Zi Yan wrote:
> >> Page cache folios from a file system that support large block size (LBS)
> >> can have minimal folio order greater than 0, thus a high order folio might
> >> not be able to be split down to order-0. Commit e220917fa507 ("mm: split a
> >> folio in minimum folio order chunks") bumps the target order of
> >> split_huge_page*() to the minimum allowed order when splitting a LBS folio.
> >> This causes confusion for some split_huge_page*() callers like memory
> >> failure handling code, since they expect after-split folios all have
> >> order-0 when split succeeds but in really get min_order_for_split() order
> >> folios.
> >>
> >> Fix it by failing a split if the folio cannot be split to the target order.
> >>
> >> Fixes: e220917fa507 ("mm: split a folio in minimum folio order chunks")
> >> [The test poisons LBS folios, which cannot be split to order-0 folios, and
> >> also tries to poison all memory. The non split LBS folios take more memory
> >> than the test anticipated, leading to OOM. The patch fixed the kernel
> >> warning and the test needs some change to avoid OOM.]
> >> Reported-by: syzbot+e6367ea2fdab6ed46056@...kaller.appspotmail.com
> >> Closes: https://lore.kernel.org/all/68d2c943.a70a0220.1b52b.02b3.GAE@google.com/
> >> Signed-off-by: Zi Yan <ziy@...dia.com>
> >
> > Generally ok with the patch in general but a bunch of comments below!
> >
> >> ---
> >>  include/linux/huge_mm.h | 28 +++++-----------------------
> >>  mm/huge_memory.c        |  9 +--------
> >>  mm/truncate.c           |  6 ++++--
> >>  3 files changed, 10 insertions(+), 33 deletions(-)
> >>
> >> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >> index 8eec7a2a977b..9950cda1526a 100644
> >> --- a/include/linux/huge_mm.h
> >> +++ b/include/linux/huge_mm.h
> >> @@ -394,34 +394,16 @@ static inline int split_huge_page_to_list_to_order(struct page *page, struct lis
> >>   * Return: 0: split is successful, otherwise split failed.
> >>   */
> >
> > You need to update the kdoc too.
>
> Done it locally.

Thanks!

>
> >
> > Also can you mention there this is the function you should use if you want
> > to specify an order?
>
> You mean min_order_for_split()? Sure.

No, I mean try_folio_split_to_order() :)

But ofc this applies to min_order_for_split() also

>
> >
> > Maybe we should rename this function to try_folio_split_to_order() to make
> > that completely explicit now that we're making other splitting logic always
> > split to order-0?
>
> Sure.

Thanks

> >
> >>  static inline int try_folio_split(struct folio *folio, struct page *page,
> >> -		struct list_head *list)
> >> +		struct list_head *list, unsigned int order)
> >
> > Is this target order? I see non_uniform_split_supported() calls this
> > new_order so maybe let's use the same naming so as not to confuse it with
> > the current folio order?
>
> Sure, will rename it to new_order.

Thanks

>
> >
> > Also - nitty one, but should we put the order as 3rd arg rather than 4th?
> >
> > As it seems it's normal to pass NULL list, and it's a bit weird to see a
> > NULL in the middle of the args.
>
> OK, will reorder the args.

Thanks

>
> >
> >>  {
> >> -	int ret = min_order_for_split(folio);
> >> -
> >> -	if (ret < 0)
> >> -		return ret;
> >
> > OK so the point of removing this is that we assume in truncate (the only
> > user) that we already have this information (i.e. from
> > mapping_min_folio_order()) right?
>
> Right.
>
> >
> >> -
> >> -	if (!non_uniform_split_supported(folio, 0, false))
> >> +	if (!non_uniform_split_supported(folio, order, false))
> >
> > While we're here can we make the mystery meat last param commented like:
> >
> > 	if (!non_uniform_split_supported(folio, order, /* warns= */false))
>
> Sure.

Thanks

>
> >
> >>  		return split_huge_page_to_list_to_order(&folio->page, list,
> >> -				ret);
> >> -	return folio_split(folio, ret, page, list);
> >> +				order);
> >> +	return folio_split(folio, order, page, list);
> >>  }
> >>  static inline int split_huge_page(struct page *page)
> >>  {
> >> -	struct folio *folio = page_folio(page);
> >> -	int ret = min_order_for_split(folio);
> >> -
> >> -	if (ret < 0)
> >> -		return ret;
> >> -
> >> -	/*
> >> -	 * split_huge_page() locks the page before splitting and
> >> -	 * expects the same page that has been split to be locked when
> >> -	 * returned. split_folio(page_folio(page)) cannot be used here
> >> -	 * because it converts the page to folio and passes the head
> >> -	 * page to be split.
> >> -	 */
> >> -	return split_huge_page_to_list_to_order(page, NULL, ret);
> >> +	return split_huge_page_to_list_to_order(page, NULL, 0);
> >
> > OK so the idea here is that callers would expect to split to 0 and the
> > specific instance where we would actually want this behaviour of splittnig
> > to a minimum order is now limited only to try_folio_split() (or
> > try_folio_split_to_order() if you rename)?
> >
>
> Before commit e220917fa507 (the one to be fixed), split_huge_page() always
> splits @page to order 0. It is just restoring the original behavior.
> If caller wants to split a different order, they should use
> split_huge_page_to_list_to_order() (current no such user except debugfs test
> code).

Yeah makes sense, though now they can also use try_folio_split_to_order() of
course!

>
> >>  }
> >>  void deferred_split_folio(struct folio *folio, bool partially_mapped);
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 0fb4af604657..af06ee6d2206 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -3829,8 +3829,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> >>
> >>  		min_order = mapping_min_folio_order(folio->mapping);
> >>  		if (new_order < min_order) {
> >> -			VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u",
> >> -				     min_order);
> >
> > Why are we dropping this?
>
> This is used to catch “misuse” of split_huge_page_to_list_to_order(), when
> caller wants to split a LBS folio to an order smaller than
> mapping_min_folio_order(). It is based on the assumption that split code
> should never fail on a LBS folio. But that assumption is causing problems
> like the reported memory failure one. So it is removed to allow split code
> to fail without a warning if a LBS folio cannot be split to the new_order.

OK fair, we shouldn't be warning if this is something that can actually
reasonably happen.

Cheers, Lorenzo