linux-kernel - Re: [PATCH 00/10] mm: PCP high auto-tuning

Message-Id: <20230920094118.8b8f739125c6aede17c627e0@linux-foundation.org>
Date:   Wed, 20 Sep 2023 09:41:18 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Huang Ying <ying.huang@...el.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Arjan Van De Ven <arjan@...ux.intel.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Vlastimil Babka <vbabka@...e.cz>,
        David Hildenbrand <david@...hat.com>,
        Johannes Weiner <jweiner@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Matthew Wilcox <willy@...radead.org>,
        Christoph Lameter <cl@...ux.com>
Subject: Re: [PATCH 00/10] mm: PCP high auto-tuning

On Wed, 20 Sep 2023 14:18:46 +0800 Huang Ying <ying.huang@...el.com> wrote:

> The page allocation performance requirements of different workloads
> are often different.  So, we need to tune the PCP (Per-CPU Pageset)
> high on each CPU automatically to optimize the page allocation
> performance.

Some of the performance changes here are downright scary.

I've never been very sure that percpu pages were very beneficial (and
hey, I invented the thing back in the Mesozoic era).  But these numbers
make me think it's very important and we should have been paying more
attention.

> The list of patches in series is as follows,
> 
>  1 mm, pcp: avoid to drain PCP when process exit
>  2 cacheinfo: calculate per-CPU data cache size
>  3 mm, pcp: reduce lock contention for draining high-order pages
>  4 mm: restrict the pcp batch scale factor to avoid too long latency
>  5 mm, page_alloc: scale the number of pages that are batch allocated
>  6 mm: add framework for PCP high auto-tuning
>  7 mm: tune PCP high automatically
>  8 mm, pcp: decrease PCP high if free pages < high watermark
>  9 mm, pcp: avoid to reduce PCP high unnecessarily
> 10 mm, pcp: reduce detecting time of consecutive high order page freeing
> 
> Patch 1/2/3 optimize the PCP draining for consecutive high-order pages
> freeing.
> 
> Patch 4/5 optimize batch freeing and allocating.
> 
> Patch 6/7/8/9 implement and optimize a PCP high auto-tuning method.
> 
> Patch 10 optimize the PCP draining for consecutive high order page
> freeing based on PCP high auto-tuning.
> 
> The test results for patches with performance impact are as follows,
> 
> kbuild
> ======
> 
> On a 2-socket Intel server with 224 logical CPU, we tested kbuild on
> one socket with `make -j 112`.
> 
>          build time  zone lock%  free_high  alloc_zone
>          ----------  ----------  ---------  ----------
> base          100.0        43.6      100.0       100.0
> patch1         96.6        40.3       49.2        95.2
> patch3         96.4        40.5       11.3        95.1
> patch5         96.1        37.9       13.3        96.8
> patch7         86.4         9.8        6.2        22.0
> patch9         85.9         9.4        4.8        16.3
> patch10        87.7        12.6       29.0        32.3

You're seriously saying that kbuild got 12% faster?

I see that [07/10] (autotuning) alone sped up kbuild by 10%?
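
For reference, the two figures above can be recomputed from the
normalized build-time column of the quoted table.  This is only a
sketch of the arithmetic: the numbers are copied from the table, the
helper name reduction_pct is made up for illustration, and the patch 7
row is compared against the patch 5 row because patch 6 (the
framework) has no row of its own.

```python
# Normalized kbuild build times copied from the quoted table (base = 100.0).
build_time = {
    "base": 100.0,
    "patch1": 96.6,
    "patch3": 96.4,
    "patch5": 96.1,
    "patch7": 86.4,
    "patch9": 85.9,
    "patch10": 87.7,
}


def reduction_pct(before, after):
    """Relative reduction in build time, in percent (hypothetical helper)."""
    return (before - after) / before * 100.0


# Whole series: base row versus the final (patch 10) row.
total = reduction_pct(build_time["base"], build_time["patch10"])

# Autotuning step: patch 7 row versus the previous measured row (patch 5).
step7 = reduction_pct(build_time["patch5"], build_time["patch7"])

print(f"whole series:       {total:.1f}% less build time")
print(f"patch 7 vs patch 5: {step7:.1f}% less build time")
```

Note that these are reductions in build time, which is what the table
reports, not increases in throughput.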

Other thoughts:

- What facilities, if any, are provided to permit users/developers to
  monitor the operation of the autotuning algorithm?

- I'm not seeing any Documentation/ updates.  Surely there are things
  we can tell users?

- This:

  : It's possible that PCP high auto-tuning doesn't work well for some
  : workloads.  So, when PCP high is tuned by hand via the sysctl knob,
  : the auto-tuning will be disabled.  The PCP high set by hand will be
  : used instead.

  Is it a bit hacky to disable autotuning when the user alters
  pcp-high?  Would it be cleaner to have a separate on/off knob for
  autotuning?

  And how is the user to determine that "PCP high auto-tuning doesn't work
  well" for their workload?
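
On the monitoring question: the per-CPU "high:" values are already
visible in the "pagesets" section of /proc/zoneinfo, so a crude way to
watch them move is to sample that file.  The sketch below is an
assumption about how one might do that with the existing interface; it
is not something this series provides, and the function name
parse_pcp_high is made up for illustration.

```python
import re


def parse_pcp_high(text):
    """Return {(node, zone, cpu): high} parsed from /proc/zoneinfo-style text.

    Only the per-CPU "high:" lines (with a colon) inside "pagesets" are
    collected; the zone watermark line "high  N" has no colon and is skipped.
    """
    result = {}
    node = zone = cpu = None
    for line in text.splitlines():
        m = re.match(r"Node (\d+), zone\s+(\S+)", line)
        if m:
            # New zone block: remember it and forget the previous CPU.
            node, zone, cpu = int(m.group(1)), m.group(2), None
            continue
        m = re.match(r"\s+cpu: (\d+)", line)
        if m:
            cpu = int(m.group(1))
            continue
        m = re.match(r"\s+high:\s+(\d+)", line)
        if m and cpu is not None:
            result[(node, zone, cpu)] = int(m.group(1))
    return result


if __name__ == "__main__":
    try:
        with open("/proc/zoneinfo") as f:
            snapshot = parse_pcp_high(f.read())
    except OSError:
        snapshot = {}  # not on Linux, or /proc is unavailable
    for (node, zone, cpu), high in sorted(snapshot.items()):
        print(f"node {node} zone {zone} cpu {cpu}: high {high}")
```

Sampling this periodically while a workload runs would show whether
"high" changes over time, but it says nothing about why the algorithm
chose a given value, which is the gap the question is about.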