Message-ID: <20250430100259.GK4439@noisy.programming.kicks-ass.net>
Date: Wed, 30 Apr 2025 12:02:59 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Cristian Prundeanu <cpru@...zon.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>,
	Hazem Mohamed Abuelfotoh <abuehaze@...zon.com>,
	Ali Saidi <alisaidi@...zon.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Geoff Blake <blakgeof@...zon.com>, Csaba Csoma <csabac@...zon.com>,
	Bjoern Doebel <doebel@...zon.com>,
	Gautham Shenoy <gautham.shenoy@....com>,
	Swapnil Sapkal <swapnil.sapkal@....com>,
	Joseph Salisbury <joseph.salisbury@...cle.com>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Ingo Molnar <mingo@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Borislav Petkov <bp@...en8.de>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-tip-commits@...r.kernel.org, x86@...nel.org
Subject: Re: EEVDF regression still exists

On Tue, Apr 29, 2025 at 04:38:17PM -0500, Cristian Prundeanu wrote:

> [1] https://github.com/aws/repro-collection/blob/main/repros/repro-mysql-EEVDF-regression/results/20250428/README.md

That 'perf sched stats diff' output is completely broken -- diffing two
different schedstat versions probably isn't working.

Anyway, looking at the two individual reports side by side:

 - schedule() left the processor idle             -- is up

vs.

 - pull_task() count on cpu newly idle            -- is down
 - load_balance() success count on cpu newly idle -- is down

These seem related, and would suggest we look at newidle balance. One of
the things we've seen before is that newidle was affected by the shorter
slice of EEVDF. But it is also quite possible something changed in the
load-balancer here.

Also of note is that .15 seems to have a lower number of 'ttwu() was
called to wake up on the local cpu' -- which I'm not quite sure how to
reconcile with the previous observation. The newidle thing seems to
suggest not enough migrations, while this would suggest too many
migrations.
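
Since the diff output is unusable, the side-by-side comparison above can
be approximated by hand. A minimal sketch, assuming the counters from the
two individual reports have already been copied into plain dicts keyed by
counter description; the function name, the tolerance parameter and the
dict layout are hypothetical and not part of perf:

```python
def counter_direction(before, after, tolerance=0.0):
    """Classify each counter found in both reports as 'up', 'down' or 'flat'.

    before/after: dicts mapping a counter description to its value.
    tolerance: relative change below which a counter counts as 'flat'
    (hypothetical knob, not something perf provides).
    """
    result = {}
    for name in before.keys() & after.keys():
        old, new = before[name], after[name]
        if old == 0:
            # Avoid dividing by zero: any growth from zero counts as 'up'.
            change = 0.0 if new == 0 else float("inf")
        else:
            change = (new - old) / old
        if change > tolerance:
            result[name] = "up"
        elif change < -tolerance:
            result[name] = "down"
        else:
            result[name] = "flat"
    return result
```

Feeding it the four counters named above from the two reports in [1]
should reproduce the up/down pattern, provided the counter descriptions
match between the two schedstat versions.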


