Message-ID: <d7243ce2-2e32-4bc9-8a00-9e69d839d240@lucifer.local>
Date: Wed, 15 Oct 2025 15:25:59 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Zi Yan <ziy@...dia.com>
Cc: linmiaohe@...wei.com, david@...hat.com, jane.chu@...cle.com,
        kernel@...kajraghav.com,
        syzbot+e6367ea2fdab6ed46056@...kaller.appspotmail.com,
        syzkaller-bugs@...glegroups.com, akpm@...ux-foundation.org,
        mcgrof@...nel.org, nao.horiguchi@...il.com,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
        Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
        Lance Yang <lance.yang@...ux.dev>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH 1/2] mm/huge_memory: do not change split_huge_page*()
 target order silently.

On Fri, Oct 10, 2025 at 01:39:05PM -0400, Zi Yan wrote:
> Page cache folios from a file system that supports large block size (LBS)
> can have a minimum folio order greater than 0, so a high-order folio might
> not be splittable down to order-0. Commit e220917fa507 ("mm: split a
> folio in minimum folio order chunks") bumps the target order of
> split_huge_page*() to the minimum allowed order when splitting an LBS folio.
> This confuses some split_huge_page*() callers, such as the memory
> failure handling code, which expect all after-split folios to have
> order-0 when the split succeeds, but in reality get folios of
> min_order_for_split() order.
>
> Fix it by failing a split if the folio cannot be split to the target order.
>
> Fixes: e220917fa507 ("mm: split a folio in minimum folio order chunks")
> [The test poisons LBS folios, which cannot be split to order-0 folios, and
> also tries to poison all memory. The non-split LBS folios take more memory
> than the test anticipated, leading to OOM. The patch fixes the kernel
> warning, and the test needs some changes to avoid OOM.]
> Reported-by: syzbot+e6367ea2fdab6ed46056@...kaller.appspotmail.com
> Closes: https://lore.kernel.org/all/68d2c943.a70a0220.1b52b.02b3.GAE@google.com/
> Signed-off-by: Zi Yan <ziy@...dia.com>

Generally OK with the patch, but a bunch of comments below!

> ---
>  include/linux/huge_mm.h | 28 +++++-----------------------
>  mm/huge_memory.c        |  9 +--------
>  mm/truncate.c           |  6 ++++--
>  3 files changed, 10 insertions(+), 33 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 8eec7a2a977b..9950cda1526a 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -394,34 +394,16 @@ static inline int split_huge_page_to_list_to_order(struct page *page, struct lis
>   * Return: 0: split is successful, otherwise split failed.
>   */

You need to update the kdoc too.

Also, can you mention there that this is the function to use if you want
to specify an order?

Maybe we should rename this function to try_folio_split_to_order() to make
that completely explicit now that we're making other splitting logic always
split to order-0?

>  static inline int try_folio_split(struct folio *folio, struct page *page,
> -		struct list_head *list)
> +		struct list_head *list, unsigned int order)

Is this target order? I see non_uniform_split_supported() calls this
new_order so maybe let's use the same naming so as not to confuse it with
the current folio order?

Also - nitty one, but should we put the order as 3rd arg rather than 4th?

As it seems it's normal to pass NULL list, and it's a bit weird to see a
NULL in the middle of the args.

>  {
> -	int ret = min_order_for_split(folio);
> -
> -	if (ret < 0)
> -		return ret;

OK so the point of removing this is that we assume in truncate (the only
user) that we already have this information (i.e. from
mapping_min_folio_order()) right?

> -
> -	if (!non_uniform_split_supported(folio, 0, false))
> +	if (!non_uniform_split_supported(folio, order, false))

While we're here, can we make the mystery-meat last param commented, like:

	if (!non_uniform_split_supported(folio, order, /* warns= */false))

>  		return split_huge_page_to_list_to_order(&folio->page, list,
> -				ret);
> -	return folio_split(folio, ret, page, list);
> +				order);
> +	return folio_split(folio, order, page, list);
>  }
>  static inline int split_huge_page(struct page *page)
>  {
> -	struct folio *folio = page_folio(page);
> -	int ret = min_order_for_split(folio);
> -
> -	if (ret < 0)
> -		return ret;
> -
> -	/*
> -	 * split_huge_page() locks the page before splitting and
> -	 * expects the same page that has been split to be locked when
> -	 * returned. split_folio(page_folio(page)) cannot be used here
> -	 * because it converts the page to folio and passes the head
> -	 * page to be split.
> -	 */
> -	return split_huge_page_to_list_to_order(page, NULL, ret);
> +	return split_huge_page_to_list_to_order(page, NULL, 0);

OK so the idea here is that callers would expect to split to 0, and the
specific instance where we actually want this behaviour of splitting
to a minimum order is now limited to try_folio_split() (or
try_folio_split_to_order() if you rename)?

>  }
>  void deferred_split_folio(struct folio *folio, bool partially_mapped);
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 0fb4af604657..af06ee6d2206 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3829,8 +3829,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>
>  		min_order = mapping_min_folio_order(folio->mapping);
>  		if (new_order < min_order) {
> -			VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u",
> -				     min_order);

Why are we dropping this?

>  			ret = -EINVAL;
>  			goto out;
>  		}

> @@ -4173,12 +4171,7 @@ int min_order_for_split(struct folio *folio)
>
>  int split_folio_to_list(struct folio *folio, struct list_head *list)
>  {
> -	int ret = min_order_for_split(folio);
> -
> -	if (ret < 0)
> -		return ret;
> -
> -	return split_huge_page_to_list_to_order(&folio->page, list, ret);
> +	return split_huge_page_to_list_to_order(&folio->page, list, 0);
>  }
>
>  /*
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 91eb92a5ce4f..1c15149ae8e9 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -194,6 +194,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  	size_t size = folio_size(folio);
>  	unsigned int offset, length;
>  	struct page *split_at, *split_at2;
> +	unsigned int min_order;
>
>  	if (pos < start)
>  		offset = start - pos;
> @@ -223,8 +224,9 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  	if (!folio_test_large(folio))
>  		return true;
>
> +	min_order = mapping_min_folio_order(folio->mapping);
>  	split_at = folio_page(folio, PAGE_ALIGN_DOWN(offset) / PAGE_SIZE);
> -	if (!try_folio_split(folio, split_at, NULL)) {
> +	if (!try_folio_split(folio, split_at, NULL, min_order)) {
>  		/*
>  		 * try to split at offset + length to make sure folios within
>  		 * the range can be dropped, especially to avoid memory waste
> @@ -254,7 +256,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  		 */
>  		if (folio_test_large(folio2) &&
>  		    folio2->mapping == folio->mapping)
> -			try_folio_split(folio2, split_at2, NULL);
> +			try_folio_split(folio2, split_at2, NULL, min_order);
>
>  		folio_unlock(folio2);
>  out:
> --
> 2.51.0
>
