Message-ID: <20250214212647.GB233399@cmpxchg.org>
Date: Fri, 14 Feb 2025 16:26:47 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Brendan Jackman <jackmanb@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Mel Gorman <mgorman@...hsingularity.net>,
Michal Hocko <mhocko@...nel.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Yosry Ahmed <yosry.ahmed@...ux.dev>
Subject: Re: [PATCH] mm/page_alloc: Clarify some migratetype fallback code
On Fri, Feb 14, 2025 at 06:14:01PM +0000, Brendan Jackman wrote:
> This code is rather confusing because:
>
> 1. "Steal" is sometimes used to refer to the general concept of
> allocating from a block of a fallback migratetype
> (steal_suitable_fallback()) but sometimes it refers specifically to
> converting a whole block's migratetype (can_steal_fallback()).
Yes, that's ambiguous.
> 2. can_steal_fallback() sounds as though it's answering the question "am
> I functionally permitted to allocate from that other type" but in
> fact it is encoding a heuristic preference.
Here I don't see that nuance tbh.
> 3. The same piece of data has different names in different places:
> can_steal vs whole_block. This reinforces point 2 because it looks
> like the different names reflect a shift in intent from "am I
> allowed to steal" to "do I want to steal", but no such shift exists.
>
> Fix 1. by avoiding the term "steal" in ambiguous contexts. This fixes
> 3. automatically since the natural name for can_steal is whole_block.
I'm not a fan of whole_block because it loses the action verb. It
makes sense in the context of steal_suitable_fallback(), but becomes
ominous in find_suitable_fallback().
Maybe @block_claimable would be better?
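E.g. (just a sketch of the signature with that rename, based on your
diff below):

	int find_suitable_fallback(struct free_area *area, unsigned int order,
				   int migratetype, bool need_whole_block,
				   bool *block_claimable);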
> Fix 2. by using "should" instead of "can", and also rename its
> parameters and add some commentary to make it more explicit what they
> mean.
>
> Signed-off-by: Brendan Jackman <jackmanb@...gle.com>
> ---
> mm/compaction.c | 4 ++--
> mm/internal.h | 2 +-
> mm/page_alloc.c | 42 ++++++++++++++++++++++--------------------
> 3 files changed, 25 insertions(+), 23 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 12ed8425fa175c5dec50bac3dddb13499abaaa11..8dccb2e388f128dd134ec6f24c924c118c9c93bb 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2332,7 +2332,7 @@ static enum compact_result __compact_finished(struct compact_control *cc)
> ret = COMPACT_NO_SUITABLE_PAGE;
> for (order = cc->order; order < NR_PAGE_ORDERS; order++) {
> struct free_area *area = &cc->zone->free_area[order];
> - bool can_steal;
> + bool whole_block;
>
> /* Job done if page is free of the right migratetype */
> if (!free_area_empty(area, migratetype))
> @@ -2349,7 +2349,7 @@ static enum compact_result __compact_finished(struct compact_control *cc)
> * other migratetype buddy lists.
> */
> if (find_suitable_fallback(area, order, migratetype,
> - true, &can_steal) != -1)
> + true, &whole_block) != -1)
This one, e.g., would look clearer with &block_claimable.
Not that it's actually used...
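FWIW, that hunk with the rename would read:

	bool block_claimable;
	...
	if (find_suitable_fallback(area, order, migratetype,
				   true, &block_claimable) != -1)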
> @@ -1948,7 +1948,7 @@ steal_suitable_fallback(struct zone *zone, struct page *page,
> if (boost_watermark(zone) && (alloc_flags & ALLOC_KSWAPD))
> set_bit(ZONE_BOOSTED_WATERMARK, &zone->flags);
>
> - /* We are not allowed to try stealing from the whole block */
> + /* No point in stealing from the whole block */
The original comment actually makes more sense to me. Why is there no
point? It could well find enough free+alike pages to steal the
block... It's just not allowed to.
I will say, the current code is pretty hard to reason about:
On one hand we check the block size statically in can_steal_fallback;
on the other hand, we do that majority scan for compatible pages in
steal_suitable_fallback(). The effective outcomes are hard to discern,
and I'm honestly not convinced they're all intentional.
For example, if we're allowed to steal the block because of this in
can_steal_fallback():

	order >= pageblock_order / 2

surely, we'll always satisfy this in steal_suitable_fallback():

	free_pages + alike_pages >= (1 << (pageblock_order - 1))

on free_pages alone.
And if the order is less than half a block, we're only allowed an
attempt at stealing it if this is true in can_steal_fallback():

	start_mt == MIGRATE_RECLAIMABLE ||
	start_mt == MIGRATE_UNMOVABLE

So why is the majority scan in steal_suitable_fallback() checking:

	if (start_type == MIGRATE_MOVABLE)
		alike_pages = movable_pages;
Here is how I read the effective rules:

- We always steal the block if at least half of it is free.

- If less than half is free, but more than half is compatible (free +
  alike), we currently do movable -> non-movable conversions.

  We don't do non-movable -> movable (won't get to the majority scan).
  This seems reasonable to me, as there seems to be little value in
  making a new pre-polluted movable block.

- However, we do non-movable -> movable conversion if more than half
  is free. This is seemingly in conflict with the previous point.
Then there is compaction, which currently uses only the
find_suitable_fallback() subset of the rules. Namely, for kernel
allocations, compaction stops as soon as there is an adequately sized
fallback. Even if the allocator won't convert the block due to the
majority scan. For movable requests, we'll stop if there is half a
block to fall back to. I suppose that's reasonable - the old
utilization vs. fragmentation debate aside...
Did I miss one?
We should be able to encode all this more concisely.
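As a strawman (the helper name and placement are made up, and it only
encodes the rules as I read them above):

	/* "Claim" here meaning: convert the whole block's migratetype */
	static bool should_claim_block(int start_mt, unsigned long free_pages,
				       unsigned long alike_pages)
	{
		/* At least half the block is free: always claim */
		if (free_pages >= (1 << (pageblock_order - 1)))
			return true;

		/* Don't create new pre-polluted movable blocks */
		if (start_mt == MIGRATE_MOVABLE)
			return false;

		/* Claim if the compatible majority is there */
		return free_pages + alike_pages >= (1 << (pageblock_order - 1));
	}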
> @@ -1995,12 +1995,14 @@ steal_suitable_fallback(struct zone *zone, struct page *page,
>
> /*
> * Check whether there is a suitable fallback freepage with requested order.
> - * If only_stealable is true, this function returns fallback_mt only if
> - * we can steal other freepages all together. This would help to reduce
> + * Sets *whole_block to instruct the caller whether it should convert a whole
> + * pageblock to the returned migratetype.
> + * If need_whole_block is true, this function returns fallback_mt only if
> + * we would do this whole-block stealing. This would help to reduce
> * fragmentation due to mixed migratetype pages in one pageblock.
> */
> int find_suitable_fallback(struct free_area *area, unsigned int order,
> - int migratetype, bool only_stealable, bool *can_steal)
> + int migratetype, bool need_whole_block, bool *whole_block)
> {
> int i;
> int fallback_mt;
> @@ -2008,19 +2010,19 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
> if (area->nr_free == 0)
> return -1;
>
> - *can_steal = false;
> + *whole_block = false;
> for (i = 0; i < MIGRATE_PCPTYPES - 1 ; i++) {
> fallback_mt = fallbacks[migratetype][i];
> if (free_area_empty(area, fallback_mt))
> continue;
>
> - if (can_steal_fallback(order, migratetype))
> - *can_steal = true;
> + if (should_steal_whole_block(order, migratetype))
> + *whole_block = true;
>
> - if (!only_stealable)
> + if (!need_whole_block)
> return fallback_mt;
>
> - if (*can_steal)
> + if (*whole_block)
> return fallback_mt;
> }
This loop is quite awkward, but I think it actually gets more awkward
with the new names. Consider this instead:

	*block_claimable = can_claim_block(order, migratetype);

	if (*block_claimable || !need_whole_block)
		return fallback_mt;
Or better yet, inline that function completely. There are no other
callers, and consolidating the rules into fewer places would IMO go a
long way toward making it easier to follow:

	if (order >= pageblock_order / 2 ||
	    migratetype == MIGRATE_RECLAIMABLE ||
	    migratetype == MIGRATE_UNMOVABLE)
		*block_claimable = true;

	if (*block_claimable || !need_whole_block)
		return fallback_mt;
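Roughly, the tail of find_suitable_fallback() would then read
something like this (untested sketch on top of your diff):

	*block_claimable = false;
	for (i = 0; i < MIGRATE_PCPTYPES - 1; i++) {
		fallback_mt = fallbacks[migratetype][i];
		if (free_area_empty(area, fallback_mt))
			continue;

		/* Inlined from can_steal_fallback() */
		if (order >= pageblock_order / 2 ||
		    migratetype == MIGRATE_RECLAIMABLE ||
		    migratetype == MIGRATE_UNMOVABLE)
			*block_claimable = true;

		if (*block_claimable || !need_whole_block)
			return fallback_mt;
	}

	return -1;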