linux-kernel - Re: [PATCH v3 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1c125a8e-650d-46fd-aff5-0cbf81cd8474@lucifer.local>
Date: Thu, 17 Jul 2025 15:31:03 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Zi Yan <ziy@...dia.com>
Cc: Balbir Singh <balbirs@...dia.com>, David Hildenbrand <david@...hat.com>,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Hugh Dickins <hughd@...gle.com>,
        Kirill Shutemov <k.shutemov@...il.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
        Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 2/2] mm/huge_memory: use folio_expected_ref_count() to
 calculate ref_count.

On Mon, Jul 14, 2025 at 01:18:23PM -0400, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().

You really should put something here about why it is that the open-coded
value and the value returned from folio_expected_ref_count() would be
expected to be the same. See comment below inline with code.

>
> Suggested-by: David Hildenbrand <david@...hat.com>
> Signed-off-by: Zi Yan <ziy@...dia.com>
> Acked-by: Balbir Singh <balbirs@...dia.com>
> Acked-by: David Hildenbrand <david@...hat.com>

Ah haha you're literally addresing some of my code review here from the
last patch :) I love it when that happens :P

I'd like you to improve the commit message, but that's a nit so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>

See below for some analysis of the folio_expected_ref_count().

> ---
>  mm/huge_memory.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a7ee731f974f..31b5c4e61a57 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3735,6 +3735,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  	if (folio_ref_freeze(folio, 1 + extra_pins)) {
>  		struct address_space *swap_cache = NULL;
>  		struct lruvec *lruvec;
> +		int expected_refs;
>
>  		if (folio_order(folio) > 1 &&
>  		    !list_empty(&folio->_deferred_list)) {
> @@ -3805,11 +3806,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  		     new_folio = next) {
>  			next = folio_next(new_folio);
>
> -			folio_ref_unfreeze(
> -				new_folio,
> -				1 + ((mapping || swap_cache) ?
> -					     folio_nr_pages(new_folio) :
> -					     0));
> +			expected_refs = folio_expected_ref_count(new_folio) + 1;


So digging in:

static inline int folio_expected_ref_count(const struct folio *folio)
{
	const int order = folio_order(folio);
	int ref_count = 0;

	...

	if (folio_test_anon(folio)) {
		/* One reference per page from the swapcache. */
		ref_count += folio_test_swapcache(folio) << order;
	} else {
		/* One reference per page from the pagecache. */
		ref_count += !!folio->mapping << order;

^---- these are covered off by (mapping || swap_cache) ? folio_nr_pages(folio)

		/* One reference from PG_private. */
		ref_count += folio_test_private(folio);

This one is trickier.

OK so looking through the logic, the can_split_folio() function will
already assert that the only pins you have are the swapcache/page cache
ones on the 'origin' folio (the mapcount bit used in the freeze doesn't matter
as you're dealing with split, not-yet-mapped 'sub'-folios).

So this precludes an elevated refcount from PG_private, therefore this will
naturally be 0.

	}

	/* One reference per page table mapping. */
	return ref_count + folio_mapcount(folio);

folio_mapcount() will be zero for these split folios, until remapped.

}

You add the + 1 to account for the folio pin of course.

TL;DR - this is correct AFAICT.


> +			folio_ref_unfreeze(new_folio, expected_refs);
>
>  			lru_add_split_folio(folio, new_folio, lruvec, list);
>
> @@ -3839,8 +3837,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  		 * Otherwise, a parallel folio_try_get() can grab origin_folio
>  		 * and its caller can see stale page cache entries.
>  		 */
> -		folio_ref_unfreeze(folio, 1 +
> -			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
> +		expected_refs = folio_expected_ref_count(folio) + 1;
> +		folio_ref_unfreeze(folio, expected_refs);
>
>  		unlock_page_lruvec(lruvec);
>
> --
> 2.47.2
>