linux-kernel - Re: [PATCH 1/3] mm/hugetlb: Restore failed global reservations to subpool

Message-Id: <20260115111946.4b50c5dbe6c6bd01638e4b16@linux-foundation.org>
Date: Thu, 15 Jan 2026 11:19:46 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Joshua Hahn <joshua.hahnjy@...il.com>
Cc: David Hildenbrand <david@...nel.org>, Muchun Song
 <muchun.song@...ux.dev>, Oscar Salvador <osalvador@...e.de>, Wupeng Ma
 <mawupeng1@...wei.com>, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 kernel-team@...a.com, stable@...r.kernel.org
Subject: Re: [PATCH 1/3] mm/hugetlb: Restore failed global reservations to
 subpool

On Thu, 15 Jan 2026 13:14:35 -0500 Joshua Hahn <joshua.hahnjy@...il.com> wrote:

> Commit a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> fixed an underflow error for hstate->resv_huge_pages caused by
> incorrectly attributing globally requested pages to the subpool's
> reservation.
> 
> Unfortunately, this fix also introduced the opposite problem, which would
> leave spool->used_hpages elevated if the globally requested pages could
> not be acquired. This is because while a subpool's reserve pages only
> account for what is requested and allocated from the subpool, its
> "used" counter keeps track of what is consumed in total, both from the
> subpool and globally. Thus, we need to adjust spool->used_hpages in the
> other direction, and make sure that globally requested pages are
> uncharged from the subpool's used counter.
> 
> Each failed allocation attempt increments the used_hpages counter by
> how many pages were requested from the global pool. Ultimately, this
> renders the subpool unusable, as used_hpages approaches the max limit.
> 
> The issue can be reproduced as follows:
> 1. Allocate 4 hugetlb pages
> 2. Create a hugetlb mount with max=4, min=2
> 3. Consume 2 pages globally
> 4. Request 3 pages from the subpool (2 from subpool + 1 from global)
> 	4.1 hugepage_subpool_get_pages(spool, 3) succeeds.
> 		used_hpages += 3
> 	4.2 hugetlb_acct_memory(h, 1) fails: no global pages left
> 		used_hpages -= 2
> 5. Subpool now has used_hpages = 1, despite not being able to
>    successfully allocate any hugepages. It believes it can now only
>    allocate 3 more hugepages, not 4.
> 
> Repeating this process will ultimately render the subpool unable to
> allocate any hugepages, since it believes that it is using the maximum
> number of hugepages that the subpool has been allotted.
> 
> The underflow issue that commit a833a693a490 fixes still remains fixed
> as well.

Thanks, I submitted the above to the Changelog Of The Year judging
committee.
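
The accounting in the quoted reproduction steps can be traced with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel code: struct toy_subpool, toy_get_pages(), toy_put_pages() and
toy_failed_reserve() are hypothetical names, there is no locking, and the
global reservation is simply assumed to fail. The restore_global argument
selects whether the globally requested pages are uncharged from used_hpages,
which is what the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a subpool; field names follow the changelog. */
struct toy_subpool {
	long max_hpages;   /* upper limit on pages charged to the subpool */
	long min_hpages;   /* pages the subpool keeps reserved for itself */
	long used_hpages;  /* pages charged, from the subpool and globally */
	long rsv_hpages;   /* reserved pages still available in the subpool */
};

/*
 * Charge delta pages to the subpool. Returns how many of them must be
 * reserved from the global pool, or -1 if the max limit would be exceeded.
 */
static long toy_get_pages(struct toy_subpool *sp, long delta)
{
	long from_global;

	if (sp->used_hpages + delta > sp->max_hpages)
		return -1;
	sp->used_hpages += delta;

	if (delta > sp->rsv_hpages) {
		from_global = delta - sp->rsv_hpages;
		sp->rsv_hpages = 0;
	} else {
		from_global = 0;
		sp->rsv_hpages -= delta;
	}
	return from_global;
}

/* Give back delta pages that were served from the subpool's own reserve. */
static void toy_put_pages(struct toy_subpool *sp, long delta)
{
	assert(delta >= 0);
	sp->used_hpages -= delta;
	sp->rsv_hpages += delta;
	if (sp->rsv_hpages > sp->min_hpages)
		sp->rsv_hpages = sp->min_hpages;
}

/*
 * One reservation attempt of chg pages in which the global pool cannot
 * supply its share. Returns used_hpages after the error path has run.
 */
static long toy_failed_reserve(struct toy_subpool *sp, long chg,
			       bool restore_global)
{
	long gbl_reserve = toy_get_pages(sp, chg);

	if (gbl_reserve < 0)
		return sp->used_hpages;	/* subpool refused the request */

	/* Assumption: the global reservation of gbl_reserve pages fails. */
	toy_put_pages(sp, chg - gbl_reserve);
	if (restore_global)
		sp->used_hpages -= gbl_reserve;	/* behaviour added by the patch */
	return sp->used_hpages;
}
```

Starting from max_hpages = 4, min_hpages = 2, rsv_hpages = 2 and
used_hpages = 0, toy_failed_reserve(&sp, 3, false) leaves used_hpages at 1,
matching step 5 of the changelog. With restore_global set to true the
counter returns to 0.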

> Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
> Cc: stable@...r.kernel.org

I'll add this to mm.git's mm-hotfixes queue, for testing and review
input.

> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6560,6 +6560,7 @@ long hugetlb_reserve_pages(struct inode *inode,
>  	struct resv_map *resv_map;
>  	struct hugetlb_cgroup *h_cg = NULL;
>  	long gbl_reserve, regions_needed = 0;
> +	unsigned long flags;

This could have been local to the {block} which uses it, which would be
nicer, no?
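
For illustration, the block-local declaration suggested here might look like
the following user-space stand-in. Everything prefixed with toy_ is
hypothetical and only mimics the shape of a lock/unlock pair that passes
saved flags around; it is not the kernel's locking API, and it is not the
actual hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a subpool protected by a lock. */
struct toy_spool {
	long max_hpages;   /* -1 means no maximum is enforced */
	long used_hpages;
	int locked;
};

/* Pretend to take the lock and save some state into *flags. */
static void toy_lock_save(struct toy_spool *sp, unsigned long *flags)
{
	*flags = 1;
	sp->locked = 1;
}

/* Pretend to restore the saved state and drop the lock. */
static void toy_unlock_restore(struct toy_spool *sp, unsigned long flags)
{
	assert(sp->locked && flags == 1);
	sp->locked = 0;
}

static void toy_restore_used(struct toy_spool *sp, long gbl_reserve)
{
	if (gbl_reserve && sp != NULL) {
		/* Declared in the only block that uses it. */
		unsigned long flags;

		toy_lock_save(sp, &flags);
		if (sp->max_hpages != -1)
			sp->used_hpages -= gbl_reserve;
		toy_unlock_restore(sp, flags);
	}
}
```

Keeping the variable inside the block makes its lifetime match the locked
region and leaves the function-level declarations unchanged.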

>  	int err;
>  
>  	/* This should never happen */
> @@ -6704,6 +6705,13 @@ long hugetlb_reserve_pages(struct inode *inode,
>  		 */
>  		hugetlb_acct_memory(h, -gbl_resv);
>  	}
> +	/* Restore used_hpages for pages that failed global reservation */
> +	if (gbl_reserve && spool) {
> +		spin_lock_irqsave(&spool->lock, flags);
> +		if (spool->max_hpages != -1)
> +			spool->used_hpages -= gbl_reserve;
> +		unlock_or_release_subpool(spool, flags);
> +	}

I'll add [2/3] and [3/3] to the mm-new queue while discarding your
perfectly good [0/N] :(

Please, let's try not to mix backportable patches with the
non-backportable ones?