Message-ID: <20240717121745.GF26750@noisy.programming.kicks-ass.net>
Date: Wed, 17 Jul 2024 14:17:45 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Chen Yu <yu.c.chen@...el.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>,
	Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
	Tim Chen <tim.c.chen@...el.com>,
	Mel Gorman <mgorman@...hsingularity.net>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	K Prateek Nayak <kprateek.nayak@....com>,
	"Gautham R . Shenoy" <gautham.shenoy@....com>,
	Chen Yu <yu.chen.surf@...il.com>, Aaron Lu <aaron.lu@...el.com>,
	linux-kernel@...r.kernel.org, void@...ifault.com
Subject: Re: [RFC PATCH 0/7] Optimization to reduce the cost of newidle
 balance

On Thu, Jul 27, 2023 at 10:33:58PM +0800, Chen Yu wrote:
> Hi,
> 
> This is the second version of the newidle balance optimization[1].
> It aims to reduce the cost of newidle balance, which was found to
> occupy noticeable CPU cycles on some high-core-count systems.
> 
> For example, when running sqlite on Intel Sapphire Rapids, which has
> 2 x 56C/112T = 224 CPUs:
> 
> 6.69%    0.09%  sqlite3     [kernel.kallsyms]   [k] newidle_balance
> 5.39%    4.71%  sqlite3     [kernel.kallsyms]   [k] update_sd_lb_stats
> 
> To mitigate this cost, the optimization is inspired by the question
> raised by Tim:
> Do we always have to find the busiest group and pull from it? Would
> a relatively busy group be enough?

So doesn't this basically boil down to recognising that newidle balance
might not be the same as regular load balancing -- we need any task,
fast, rather than needing to equalize load?
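
Concretely that would mean an early-exit scan instead of the full
update_sd_lb_stats() pass over every group. A rough, untested sketch of
the idea -- group_has_pullable_task() is a made-up helper here, standing
in for whatever cheap "is there anything to pull" test we'd settle on:

	static struct sched_group *find_any_busy_group(struct lb_env *env)
	{
		struct sched_group *group = env->sd->groups;

		do {
			/* first group with something to pull is good enough */
			if (group_has_pullable_task(group))
				return group;
			group = group->next;
		} while (group != env->sd->groups);

		return NULL;
	}

That trades balance quality for latency, which is exactly the trade-off
newidle wants to make.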

David's shared-runqueue patches did the same; they re-imagined this very
path.

Now, David's thing went sideways because of some regression that was
never investigated further.

But it occurs to me this might be the same thing that Prateek chased
down here:

  https://lkml.kernel.org/r/20240710090210.41856-1-kprateek.nayak@amd.com

Hmm?

Supposing that is indeed the case, I think it makes more sense to
proceed with that approach -- that is, to completely redo the sub-NUMA
newidle balance.
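
Purely as an illustration of the shape of it (not Prateek's actual
series): have the newidle pass stop at the first NUMA-spanning domain
and leave cross-node pulls to the periodic balancer:

	/* illustrative sketch only */
	for_each_domain(this_cpu, sd) {
		if (sd->flags & SD_NUMA)
			break;	/* don't pull across nodes from newidle */

		pulled_task = load_balance(this_cpu, this_rq, sd,
					   CPU_NEWLY_IDLE,
					   &continue_balancing);
		if (pulled_task)
			break;
	}

That keeps the expensive cross-node stats walk out of the hot path
entirely.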


