Message-ID: <20250716134640.GA20846@pauld.westford.csb>
Date: Wed, 16 Jul 2025 09:46:40 -0400
From: Phil Auld <pauld@...hat.com>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org,
	mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, vschneid@...hat.com, clm@...a.com
Subject: Re: [PATCH v2 00/12] sched: Address schbench regression


Hi Peter,

On Mon, Jul 07, 2025 at 03:08:08PM +0530 Shrikanth Hegde wrote:
> 
> 
> On 7/7/25 14:41, Peter Zijlstra wrote:
> > On Mon, Jul 07, 2025 at 02:35:38PM +0530, Shrikanth Hegde wrote:
> > > 
> > > 
> > > On 7/2/25 17:19, Peter Zijlstra wrote:
> > > > Hi!
> > > > 
> > > > Previous version:
> > > > 
> > > >     https://lkml.kernel.org/r/20250520094538.086709102@infradead.org
> > > > 
> > > > 
> > > > Changes:
> > > >    - keep dl_server_stop(), just remove the 'normal' usage of it (juril)
> > > >    - have the sched_delayed wake list IPIs do select_task_rq() (vingu)
> > > >    - fixed lockdep splat (dietmar)
> > > >    - added a few preperatory patches
> > > > 
> > > > 
> > > > Patches apply on top of tip/master (which includes the disabling of private futex)
> > > > and clm's newidle balance patch (which I'm awaiting vingu's ack on).
> > > > 
> > > > Performance is similar to the last version; as tested on my SPR on v6.15 base:
> > > > 
> > > 
> > > 
> > > Hi Peter,
> > > Gave this a spin on a machine with 5 cores (SMT8) PowerPC system.
> > > 
> > > I see significant regression in schbench. let me know if i have to test different
> > > number of threads based on the system size.
> > > Will go through the series and will try a bisect meanwhile.
> > 
> > Urgh, those are terrible numbers :/
> > 
> > What do the caches look like on that setup? Obviously all the 8 SMT
> > (is this the supercore that glues two SMT4 things together for backwards
> > compat?) share some cache, but is there some shared cache between the
> > cores?
> 
> It is a supercore(we call it as bigcore) which glues two SMT4 cores. LLC is
> per SMT4 core. So from scheduler perspective system is 10 cores (SMT4)
> 

We've confirmed the issue with schbench on EPYC hardware. It's not limited
to PPC systems, although this system may also have interesting caching. 
We don't see issues with our other tests.

---------------

Here are the latency reports from schbench on a single-socket AMD EPYC
9655P server with 96 cores and 192 CPUs.

Results for this test:
./schbench/schbench -L -m 4 -t 192 -i 30 -r 30

6.15.0-rc6  baseline
threads  wakeup_99_usec  request_99_usec
1        5               3180
16       5               3996
64       3452            14256
128      7112            32960
192      11536           46016

6.15.0-rc6.pz_fixes2 (with the 12-part series)
threads  wakeup_99_usec  request_99_usec
1        5               3172
16       5               3844
64       3348            17376
128      21024           100480
192      44224           176384

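For 128 and 192 threads, both wakeup and request p99 latencies increased
by a factor of roughly 3x to 4x (about 3x at 128 threads and about 3.8x
at 192 threads). At 64 threads and below, the numbers are close to baseline
except for request latency at 64 threads, which is about 1.2x higher.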
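
As a quick sanity check on the slowdown, the per-thread-count ratios can be
recomputed from the two tables above. This is only a minimal sketch: the
numbers are copied by hand from the tables, and the dictionary and function
names (baseline, patched, slowdown) are made up for illustration and are not
part of schbench or any test harness.

```python
# p99 latencies in usec, copied from the two schbench tables above.
# Each entry maps threads -> (wakeup_99_usec, request_99_usec).
baseline = {
    1: (5, 3180),
    16: (5, 3996),
    64: (3452, 14256),
    128: (7112, 32960),
    192: (11536, 46016),
}
patched = {
    1: (5, 3172),
    16: (5, 3844),
    64: (3348, 17376),
    128: (21024, 100480),
    192: (44224, 176384),
}


def slowdown(before, after):
    """Return threads -> (wakeup ratio, request ratio), after / before."""
    return {
        t: (after[t][0] / before[t][0], after[t][1] / before[t][1])
        for t in before
    }


ratios = slowdown(baseline, patched)

print("threads  wakeup  request")
for threads, (wakeup, request) in ratios.items():
    print(f"{threads:<7}  {wakeup:5.2f}x  {request:5.2f}x")
```

A ratio above 1.0 means the patched kernel is slower for that row.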

We're testing now with NO_TTWU_QUEUE_DELAYED and I'll try to report on
that when we have results. 

Cheers,
Phil
-- 

