Message-ID: <B27ECDA1-632D-44CD-AB99-B7A9C27393E4@amazon.com>
Date: Fri, 2 May 2025 17:25:14 +0000
From: "Prundeanu, Cristian" <cpru@...zon.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: K Prateek Nayak <kprateek.nayak@....com>, "Mohamed Abuelfotoh, Hazem"
	<abuehaze@...zon.com>, "Saidi, Ali" <alisaidi@...zon.com>, "Benjamin
 Herrenschmidt" <benh@...nel.crashing.org>, "Blake, Geoff"
	<blakgeof@...zon.com>, "Csoma, Csaba" <csabac@...zon.com>, "Doebel, Bjoern"
	<doebel@...zon.de>, Gautham Shenoy <gautham.shenoy@....com>, Swapnil Sapkal
	<swapnil.sapkal@....com>, Joseph Salisbury <joseph.salisbury@...cle.com>,
	Dietmar Eggemann <dietmar.eggemann@....com>, Ingo Molnar <mingo@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>, Borislav Petkov
	<bp@...en8.de>, "linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-tip-commits@...r.kernel.org"
	<linux-tip-commits@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: EEVDF regression still exists

On 2025-04-30, 05:03, "Peter Zijlstra" <peterz@...radead.org> wrote:

> Anyway, looking at the two individual reports side by side:
>
> - schedule() left the processor idle -- is up
>
> vs.
>
> - pull_task() count on cpu newly idle -- is down
> - load_balance() success count on cpu newly idle -- is down
>
> Which seem related and would suggest we look at newidle balance. One of
> the things we've seen before is that newidle was affected by the shorter
> slice of EEVDF. But it is also quite possible something changed in the
> load-balancer here.
>
> Also of note is that .15 seems to have a lower number of 'ttwu() was
> called to wake up on the local cpu' -- which I'm not quite sure how to
> rhyme with the previous observation. The newidle thing seems to suggest
> not enough migrations, while this would suggest too many migrations.

A 2x longer slice on 6.15 does improve performance somewhat, but not by much.
I went back to my previous tests: in September I tried multiple slice
values (1.5ms, 3ms, 6ms, 12ms) on 6.5 and 6.6. The response was noisy (much
less so on CFS, however) and nonlinear, peaking at 3ms.
Does the nonlinearity match your expectations? Would there be a reason for
it to change in more recent kernels?
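
For reference, a slice sweep of this kind can be sketched roughly as below.
This is a minimal illustration, not the harness used for the tests above. It
assumes debugfs is mounted at /sys/kernel/debug and that the kernel exposes
the base_slice_ns knob (EEVDF kernels, 6.6 and later); on 6.5 the comparable
CFS knob should be min_granularity_ns, so the path would need adjusting.
The sweep() helper and its run_benchmark callback are hypothetical names.

```python
from pathlib import Path

# Assumption: debugfs is mounted here and the kernel provides base_slice_ns
# (6.6+). Writing to it requires root.
KNOB = Path("/sys/kernel/debug/sched/base_slice_ns")

# Slice values from the tests described above, in milliseconds.
SLICES_MS = [1.5, 3, 6, 12]


def ms_to_ns(ms):
    """Convert milliseconds to whole nanoseconds, the unit the knob expects."""
    return int(round(ms * 1_000_000))


def set_slice(ms, knob=KNOB):
    """Write a slice length, given in milliseconds, to the scheduler knob."""
    knob.write_text(f"{ms_to_ns(ms)}\n")


def sweep(run_benchmark, slices_ms=SLICES_MS, knob=KNOB):
    """Run run_benchmark() once per slice value and return {ms: result}.

    run_benchmark is a hypothetical caller-supplied callable that runs the
    workload and returns a score. The original knob value is restored
    afterwards, even if a run raises.
    """
    original = knob.read_text()
    results = {}
    try:
        for ms in slices_ms:
            set_slice(ms, knob)
            results[ms] = run_benchmark()
    finally:
        knob.write_text(original)
    return results
```

Given the noise, each slice value would likely need several runs per kernel
before comparing the resulting scores.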

Another, more recent observation is that 6.15-rc4 has worse performance than
rc3 and earlier kernels. Maybe that can help narrow down the cause?
I've added the perf reports for rc3 and rc2 in the same location as before.

https://github.com/aws/repro-collection/blob/main/repros/repro-mysql-EEVDF-regression/results/20250428/README.md#raw-data

