lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7e638027-7c50-4af8-ab6b-fa0a438098a2@suse.cz>
Date: Thu, 10 Apr 2025 10:51:49 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Johannes Weiner <hannes@...xchg.org>,
 Andrew Morton <akpm@...ux-foundation.org>
Cc: Brendan Jackman <jackmanb@...gle.com>,
 Mel Gorman <mgorman@...hsingularity.net>, Carlos Song <carlos.song@....com>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] mm: page_alloc: tighten up find_suitable_fallback()

On 4/7/25 20:01, Johannes Weiner wrote:
> find_suitable_fallback() is not as efficient as it could be, and
> somewhat difficult to follow.
> 
> 1. should_try_claim_block() is a loop invariant. There is no point in
>    checking fallback areas if the caller is interested in claimable
>    blocks but the order and the migratetype don't allow for that.
> 
> 2. __rmqueue_steal() doesn't care about claimability, so it shouldn't
>    have to run those tests.
> 
> Different callers want different things from this helper:
> 
> 1. __compact_finished() scans orders up until it finds a claimable block
> 2. __rmqueue_claim() scans orders down as long as blocks are claimable
> 3. __rmqueue_steal() doesn't care about claimability at all
> 
> Move should_try_claim_block() out of the loop. Only test it for the
> two callers who care in the first place. Distinguish "no blocks" from
> "order + mt are not claimable" in the return value; __rmqueue_claim()
> can stop once order becomes unclaimable, __compact_finished() can keep
> advancing until order becomes claimable.
> 
> Before:
> 
>  Performance counter stats for './run case-lru-file-mmap-read' (5 runs):
> 
> 	 85,294.85 msec task-clock                       #    5.644 CPUs utilized               ( +-  0.32% )
> 	    15,968      context-switches                 #  187.209 /sec                        ( +-  3.81% )
> 	       153      cpu-migrations                   #    1.794 /sec                        ( +-  3.29% )
> 	   801,808      page-faults                      #    9.400 K/sec                       ( +-  0.10% )
>    733,358,331,786      instructions                     #    1.87  insn per cycle              ( +-  0.20% )  (64.94%)
>    392,622,904,199      cycles                           #    4.603 GHz                         ( +-  0.31% )  (64.84%)
>    148,563,488,531      branches                         #    1.742 G/sec                       ( +-  0.18% )  (63.86%)
>        152,143,228      branch-misses                    #    0.10% of all branches             ( +-  1.19% )  (62.82%)
> 
> 	   15.1128 +- 0.0637 seconds time elapsed  ( +-  0.42% )
> 
> After:
> 
>  Performance counter stats for './run case-lru-file-mmap-read' (5 runs):
> 
>          84,380.21 msec task-clock                       #    5.664 CPUs utilized               ( +-  0.21% )
>             16,656      context-switches                 #  197.392 /sec                        ( +-  3.27% )
>                151      cpu-migrations                   #    1.790 /sec                        ( +-  3.28% )
>            801,703      page-faults                      #    9.501 K/sec                       ( +-  0.09% )
>    731,914,183,060      instructions                     #    1.88  insn per cycle              ( +-  0.38% )  (64.90%)
>    388,673,535,116      cycles                           #    4.606 GHz                         ( +-  0.24% )  (65.06%)
>    148,251,482,143      branches                         #    1.757 G/sec                       ( +-  0.37% )  (63.92%)
>        149,766,550      branch-misses                    #    0.10% of all branches             ( +-  1.22% )  (62.88%)
> 
>            14.8968 +- 0.0486 seconds time elapsed  ( +-  0.33% )
> 
> Signed-off-by: Johannes Weiner <hannes@...xchg.org>

Yay, you found a way to get rid of the ugly "bool claim_only, bool
*claim_block" parameter combo. Great!

Reviewed-by: Vlastimil Babka <vbabka@...e.cz>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ