Date: Sun, 9 Jun 2024 09:35:36 +0000
From: David Laight <David.Laight@...LAB.COM>
To: Linux kernel regressions list <regressions@...ts.linux.dev>
CC: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, Linus Torvalds
	<torvalds@...ux-foundation.org>
Subject: RE: Linux 6.10-rc2 - massive performance regression

From: Linux regression tracking (Thorsten Leemhuis)
> Sent: 09 June 2024 09:11
> 
> On 09.06.24 00:00, Linus Torvalds wrote:
> > On Sat, 8 Jun 2024 at 14:36, David Laight <David.Laight@...lab.com> wrote:
> > [...]
> >> I've done some tests.
> >> I'm seeing a three-fold slow down on:
> >> $ i=0; while [ $i -lt 1000000 ]; do i=$((i + 1)); done
> >> which goes from 1 second to 3.
> >>
> >> I can run that with ftrace monitoring scheduler events (and a few
> >> other things) and can't spot anywhere the process isn't running
> >> for a significant time.
> >
> > Sounds like cpu frequency. Almost certainly hw-specific. I went
> > through that on my Threadripper in the 6.9 timeframe, but I'm not
> > seeing any issues in this current release.
> 
> David, what kind of hardware do you use?

This is on an i7-7700 (4 cores with hyperthreading enabled, so 8 cpus).

> Johan Hovold as man-in-the-middle just reported "CPU frequency of the
> big cores on the Lenovo ThinkPad X13s sometimes appears to get stuck
> at a low frequency with 6.10-rc2" and confirmed "that once the cores
> are fully throttled (using the stepwise thermal governor) due to the
> skin temperature reaching the first trip point, scaling_max_freq gets
> stuck at the next OPP".

That's not what I'm seeing.
I can turn the speed up and down by stopping/starting a daemon we use
for processing audio.
(I can give anyone a copy; it is freely downloadable from the company
web site - if you know exactly where to look!)
Basically that ends up running a bit of code on every cpu every 10ms.

There is a big difference in the number of sched_migrate_task traces
between 6.9 and 6.10 (15 v 83).
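
For anyone who wants to reproduce that kind of count, here is a rough
sketch. It assumes tracefs is mounted at /sys/kernel/tracing and that you
are root; the helper names are made up for illustration, and this is not
necessarily how the numbers above were collected. Note that it counts
migrations of every task on the system, not just the test process.

```shell
# Hypothetical helpers; TRACEFS location is an assumption (older setups
# use /sys/kernel/debug/tracing instead).
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# Count sched_migrate_task events in a saved trace file.
count_migrations() {
    grep -c 'sched_migrate_task:' "$1"
}

# Enable the event, clear the buffer, run a command, then count (needs root).
trace_migrations() {
    echo 1 > "$TRACEFS/events/sched/sched_migrate_task/enable"
    echo > "$TRACEFS/trace"
    "$@"
    echo 0 > "$TRACEFS/events/sched/sched_migrate_task/enable"
    count_migrations "$TRACEFS/trace"
}
```

Something like
`trace_migrations sh -c 'i=0; while [ $i -lt 1000000 ]; do i=$((i + 1)); done'`
on each kernel should then give comparable figures.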

I suspect the underlying problem is that the cpu governor doesn't
allow for a 'busy' process being moved to an idle cpu.
So if you bounce a process about, it always runs at 800MHz.
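
One way to check that theory, as a sketch only: run the same loop pinned
to a single cpu and unpinned, and watch the per-cpu frequencies while it
runs. The function names are invented for illustration, taskset (from
util-linux) is assumed to be installed, and the sysfs files only exist
when a cpufreq driver is loaded.

```shell
# Hypothetical helpers, not part of the original test.

# The same busy loop, parameterised on the iteration count.
busy_loop() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do i=$((i + 1)); done
    echo "$i"
}

# Print "cpuN <kHz>" for every cpu that exposes cpufreq in sysfs.
show_freqs() {
    for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        cpu=${f#/sys/devices/system/cpu/}
        printf '%s %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
    return 0
}

# Run the busy loop in a child shell pinned to cpu $1 for $2 iterations.
pinned_loop() {
    taskset -c "$1" sh -c \
        'i=0; while [ "$i" -lt "$1" ]; do i=$((i + 1)); done' sh "$2"
}
```

Under bash, compare `time busy_loop 1000000 >/dev/null` with
`time pinned_loop 2 1000000`, running `show_freqs` from another terminal
during each. If the pinned run is fast and its cpu ramps up while the
unpinned run stays near the minimum frequency, that would point at
migration rather than thermal throttling.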

My dmesg (6.9 and 6.10) has:
cpuidle: using governor idle
cpuidle: using governor ladder

But I had a feeling that some 'hardware magic' changes the cpu
speed on these systems?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
