Message-ID: <20240425104220.GE21980@noisy.programming.kicks-ass.net>
Date: Thu, 25 Apr 2024 12:42:20 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Luis Machado <luis.machado@....com>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
	linux-kernel@...r.kernel.org, kprateek.nayak@....com,
	wuyun.abel@...edance.com, tglx@...utronix.de, efault@....de,
	nd <nd@....com>, John Stultz <jstultz@...gle.com>
Subject: Re: [RFC][PATCH 08/10] sched/fair: Implement delayed dequeue

On Wed, Apr 24, 2024 at 04:15:42PM +0100, Luis Machado wrote:

> > Bisecting through the patches in this series, I ended up with patch 08/10
> > as the one that improved things overall for these benchmarks.
> > 
> > I'd like to investigate this further to understand the reason behind some of
> > these dramatic improvements.
> > 
> 
> Investigating these improvements a bit more, I noticed they came with a significantly
> higher power usage on the Pixel6 (where EAS is enabled). I bisected it down to the delayed
> dequeue patch. Disabling DELAY_DEQUEUE and DELAY_ZERO at runtime doesn't help in bringing
> the power usage down.

Hmm, that is unexpected. The intent was for NO_DELAY_DEQUEUE to fully
disable things. I'll go have a prod at it.
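
For reference, a minimal sketch of how such a runtime toggle is usually done. It assumes debugfs is mounted at /sys/kernel/debug and that the kernel exposes the scheduler feature flags there; the helper names and the FEATURES variable are hypothetical, used only for illustration. The DELAY_DEQUEUE and DELAY_ZERO names come from this series.

```shell
# Assumption: scheduler feature flags live in this debugfs file.
# FEATURES can be overridden, e.g. to point at a different mount point.
FEATURES="${FEATURES:-/sys/kernel/debug/sched/features}"

# Hypothetical helper: write a feature name (or its NO_ form) to the file.
# Writing "NO_<NAME>" disables a feature, writing "<NAME>" enables it.
sched_feat_set() {
    printf '%s\n' "$1" > "$FEATURES"
}

# Hypothetical helper: succeed if the exact name is listed in the file.
# Disabled features are listed with a NO_ prefix, so an exact match on the
# bare name means the feature is currently enabled.
sched_feat_enabled() {
    tr ' ' '\n' < "$FEATURES" | grep -qx "$1"
}
```

Typical use (as root) would be `sched_feat_set NO_DELAY_DEQUEUE` followed by `sched_feat_enabled DELAY_DEQUEUE || echo "delayed dequeue is off"`, which confirms the flag flipped but says nothing about whether every code path honours it.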

> Though I don't fully understand the reason behind this change in behavior yet, I did spot
> the benchmark processes running almost entirely on the big core cluster, with little
> to no use of the little core and mid core clusters.
> 
> That would explain higher power usage and also the significant jump in performance.

ISTR you (arm) have these tools to trace and plot the various util
values. This should be readily reflected there if that is the case, no?

> I wonder if the delayed dequeue logic is having an unwanted effect on the calculation of
> utilization/load of the runqueue and, as a consequence, we're scheduling things to run on
> higher OPP's in the big cores, leading to poor decisions for energy efficiency.

Notably util_est_update() gets delayed. Given we don't actually do an
enqueue when a delayed task gets woken, it didn't seem to make sense to
update that sooner.

I'll go over all that again.
