Message-ID: <aULbwOHkRvWwy6zg@cmpxchg.org>
Date: Wed, 17 Dec 2025 11:35:12 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Brendan Jackman <jackmanb@...gle.com>, Zi Yan <ziy@...dia.com>,
David Rientjes <rientjes@...gle.com>,
David Hildenbrand <david@...nel.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Mike Rapoport <rppt@...nel.org>,
Joshua Hahn <joshua.hahnjy@...il.com>,
Pedro Falcato <pfalcato@...e.de>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 2/2] mm, page_alloc: fail costly __GFP_NORETRY allocations faster

On Wed, Dec 17, 2025 at 09:46:34AM +0100, Vlastimil Babka wrote:
> On 12/16/25 21:32, Johannes Weiner wrote:
> > On Tue, Dec 16, 2025 at 04:54:22PM +0100, Vlastimil Babka wrote:
> >> It might make therefore more sense to just fail unconditionally after
> >> the initial compaction attempt, so do that instead. Costly allocations
> >> that do want the reclaim/compaction to happen at least once can omit
> >> __GFP_NORETRY, or even specify __GFP_RETRY_MAYFAIL for more than one
> >> attempt.
> >>
> >> There is a slight potential unfairness in that costly __GFP_NORETRY
> >> allocations that can't perform direct compaction (i.e. lack __GFP_IO)
> >> will still be allowed to direct reclaim, while those that can direct
> >> compact will now never attempt direct reclaim. However, in cases of
> >> memory pressure causing compaction to be skipped due to insufficient
> >> base pages, direct reclaim was already not done before, so there should
> >> be no functional regressions from this change.
> >
> > Hm, kind of. There could be enough basepages for compaction_suitable()
> > but compaction odds are still higher with more free pages. So there
> > might be cases it regresses.
> >
> > __GFP_NORETRY semantics say it'll try reclaim at least once. We should
> > be able to keep that and still simplify, no?
> >
> >>  	if (costly_order && (gfp_mask & __GFP_NORETRY)) {
> >> -		if (gfp_mask & __GFP_THISNODE)
> >> -			goto nopage;
> >> +		goto nopage;
> >
> > IOW, maybe directly select for the NUMA-THP special case here?
> >
> > 	/* Optimistic node-local huge page - only compact once */
> > 	if (costly_order &&
> > 	    ((gfp_mask & (__GFP_NORETRY|__GFP_THISNODE)) ==
> > 	     (__GFP_NORETRY|__GFP_THISNODE)))
> > 		goto nopage;
> >
> > and then let other __GFP_NORETRY fall through.
>
> I did consider it as an alternative when realizing the potential unfairness
> mentioned above, but then went with the simpler code option.
>
> With your suggestion we keep the THP-specific check but at least remove the
> arguably illogical compaction feedback.

Yes, I'm in favor of removing those either way.

Reclaim makes its own decisions around costly orders. For example, it
targets a higher number of free pages through compaction_ready() than
where compaction would return SKIPPED, to account for concurrency. I
don't think the allocator should have conflicting opinions.

Regarding __GFP_NORETRY: I think it would just be a chance to simplify
the mental model around it again. If somebody does a NORETRY request
when memory is full of stale page cache, I think it's reasonable to
expect at least one shot at dropping some cache to make it happen.

Shortcutting directly to compaction is a good optimization when we
suspect it could succeed without requiring reclaim. But I'm not sure
it's reasonable to ONLY do that and give up.

Btw, I do wonder why that up-front compaction run is so explicit, when
we have

	__alloc_pages_direct_reclaim()
	__alloc_pages_direct_compact()

calls following below. Couldn't we check for conditions upfront and
set a flag to skip reclaim initially? Then handle priority adjustments
in the retry conditions? IOW, something like:

	unsigned long did_some_progress = 0;

	if (can_compact && costly_order)
		skip_reclaim = true;
	if (can_compact && order > 0 && ac->migratetype != MIGRATE_MOVABLE)
		skip_reclaim = true;
	if (gfp_thisnode_noretry(gfp_mask))
		skip_reclaim = true;

retry:
	page = get_page_from_freelist(..., alloc_flags, ...);
	if (page)
		goto got_pg;

	if (!skip_reclaim) {
		page = __alloc_pages_direct_reclaim(..., &did_some_progress);
		if (page)
			goto got_pg;
	}

	page = __alloc_pages_direct_compact(...);
	if (page)
		goto got_pg;

	if (should_loop()) {
		skip_reclaim = false;
		compact_priority = ...;
		goto retry;
	}

That would naturally get rid of the gfp_pfmemalloc_allowed() branch
for the upfront check as well, because the ALLOC_NO_WATERMARKS attempt
happens before we do the reclaim/compaction calls.
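
To make the ordering easier to reason about, here is a rough user-space
model of the sketch above. It is not kernel code and is untested against
the real allocator: the flag bit values, struct sim, alloc_slowpath_model()
and the max_loops stand-in for should_loop() are all made up for
illustration, and gfp_thisnode_noretry() is only one plausible way to write
the helper. The model also assumes that only compaction ever produces a
page, and it leaves out the get_page_from_freelist() attempt entirely.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real __GFP_* encodings. */
#define GFP_NORETRY	(1u << 0)
#define GFP_THISNODE	(1u << 1)
#define COSTLY_ORDER	3	/* mirrors PAGE_ALLOC_COSTLY_ORDER */

/* Hypothetical helper from the sketch: true only if BOTH flags are set. */
static bool gfp_thisnode_noretry(unsigned int gfp_mask)
{
	const unsigned int both = GFP_NORETRY | GFP_THISNODE;

	return (gfp_mask & both) == both;
}

/* Counters standing in for the real reclaim/compaction calls. */
struct sim {
	int reclaim_calls;
	int compact_calls;
	int compact_succeeds_on;	/* Nth compaction call succeeds; 0 = never */
};

/* Returns true if the modelled allocation "got a page". */
static bool alloc_slowpath_model(unsigned int gfp_mask, bool can_compact,
				 unsigned int order, bool movable,
				 int max_loops, struct sim *s)
{
	bool skip_reclaim = false;
	int loops = 0;

	if (can_compact && order > COSTLY_ORDER)
		skip_reclaim = true;
	if (can_compact && order > 0 && !movable)
		skip_reclaim = true;
	if (gfp_thisnode_noretry(gfp_mask))
		skip_reclaim = true;

retry:
	if (!skip_reclaim)
		s->reclaim_calls++;	/* __alloc_pages_direct_reclaim() */

	s->compact_calls++;		/* __alloc_pages_direct_compact() */
	if (s->compact_calls == s->compact_succeeds_on)
		return true;

	/* Assumed should_loop(): node-local NORETRY never loops. */
	if (!gfp_thisnode_noretry(gfp_mask) && loops++ < max_loops) {
		skip_reclaim = false;
		goto retry;
	}
	return false;
}
```

In this model a costly request whose first compaction attempt fails gets
exactly one reclaim pass before the second compaction attempt, while a
NORETRY|THISNODE request compacts once and gives up without reclaiming.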