Message-ID: <20260115194511.836766-1-joshua.hahnjy@gmail.com>
Date: Thu, 15 Jan 2026 14:45:10 -0500
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: David Hildenbrand <david@...nel.org>,
	Muchun Song <muchun.song@...ux.dev>,
	Oscar Salvador <osalvador@...e.de>,
	Wupeng Ma <mawupeng1@...wei.com>,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	kernel-team@...a.com,
	stable@...r.kernel.org
Subject: Re: [PATCH 1/3] mm/hugetlb: Restore failed global reservations to subpool

On Thu, 15 Jan 2026 11:19:46 -0800 Andrew Morton <akpm@...ux-foundation.org> wrote:

> On Thu, 15 Jan 2026 13:14:35 -0500 Joshua Hahn <joshua.hahnjy@...il.com> wrote:

Hello Andrew, I hope you are doing well. Thank you for your help as always!

> > Commit a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> > fixed an underflow error for hstate->resv_huge_pages caused by
> > incorrectly attributing globally requested pages to the subpool's
> > reservation.
> > 
> > Unfortunately, this fix also introduced the opposite problem, which would
> > leave spool->used_hpages elevated if the globally requested pages could
> > not be acquired. This is because while a subpool's reserve pages only
> > accounts for what is requested and allocated from the subpool, its
> > "used" counter keeps track of what is consumed in total, both from the
> > subpool and globally. Thus, we need to adjust spool->used_hpages in the
> > other direction, and make sure that globally requested pages are
> > uncharged from the subpool's used counter.
> > 
> > Each failed allocation attempt increments the used_hpages counter by
> > how many pages were requested from the global pool. Ultimately, this
> > renders the subpool unusable, as used_hpages approaches the max limit.
> > 
> > The issue can be reproduced as follows:
> > 1. Allocate 4 hugetlb pages
> > 2. Create a hugetlb mount with max=4, min=2
> > 3. Consume 2 pages globally
> > 4. Request 3 pages from the subpool (2 from subpool + 1 from global)
> > 	4.1 hugepage_subpool_get_pages(spool, 3) succeeds.
> > 		used_hpages += 3
> > 	4.2 hugetlb_acct_memory(h, 1) fails: no global pages left
> > 		used_hpages -= 2
> > 5. Subpool now has used_hpages = 1, despite not being able to
> >    successfully allocate any hugepages. It believes it can now only
> >    allocate 3 more hugepages, not 4.
> > 
> > Repeating this process will ultimately render the subpool unable to
> > allocate any hugepages, since it believes that it is using the maximum
> > number of hugepages that the subpool has been allotted.
> > 
> > The underflow issue that commit a833a693a490 fixes still remains fixed
> > as well.
> 
> Thanks, I submitted the above to the Changelog Of The Year judging
> committee.

:-) Thank you for the kind words!

> > Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
> > Cc: stable@...r.kernel.org
> 
> I'll add this to mm.git's mm-hotfixes queue, for testing and review
> input.

Sounds good to me! I'll wait a bit in case others have different concerns,
but I'll send out a new version which addresses your comments below (and
any future comments) in a day or two.

> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -6560,6 +6560,7 @@ long hugetlb_reserve_pages(struct inode *inode,
> >  	struct resv_map *resv_map;
> >  	struct hugetlb_cgroup *h_cg = NULL;
> >  	long gbl_reserve, regions_needed = 0;
> > +	unsigned long flags;
> 
> This could have been local to the {block} which uses it, which would be
> nicer, no?

Definitely, I'll address this in v2!

> >  	int err;
> >  
> >  	/* This should never happen */
> > @@ -6704,6 +6705,13 @@ long hugetlb_reserve_pages(struct inode *inode,
> >  		 */
> >  		hugetlb_acct_memory(h, -gbl_resv);
> >  	}
> > +	/* Restore used_hpages for pages that failed global reservation */
> > +	if (gbl_reserve && spool) {
> > +		spin_lock_irqsave(&spool->lock, flags);
> > +		if (spool->max_hpages != -1)
> > +			spool->used_hpages -= gbl_reserve;
> > +		unlock_or_release_subpool(spool, flags);
> > +	}
> 
> I'll add [2/3] and [3/3] to the mm-new queue while discarding your
> perfectly good [0/N] :(
> 
> Please, let's try not to mix backportable patches with the
> non-backportable ones?

Oh no! Sorry, this is my first time Cc-ing stable so I wasn't aware of the
implications. In v2, I'll send the two out as separate patches, so that it's
easier to backport. I was just eager to send out 2/3 and 3/3 because I've
been waiting for a functional hugetlb patch to smoosh these cleanups into.

I'll be more mindful in the future.

Thank you again, I hope you have a great day!!
Joshua
