Date: Sun, 9 Jun 2024 14:24:50 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Linus Torvalds' <torvalds@...ux-foundation.org>
CC: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, 'Woody Suwalski'
	<terraluna977@...il.com>, 'Johan Hovold' <johan@...nel.org>
Subject: RE: Linux 6.10-rc2 - massive performance regression

From: Linus Torvalds
> Sent: 08 June 2024 23:01
> 
> On Sat, 8 Jun 2024 at 14:36, David Laight <David.Laight@...lab.com> wrote:
> >
> > I'll try to remember how to bisect through the merge :-)
> 
> git bisect should just do all the work for you. All you need to do is
> give a know good and bad point, and keep testing what git bisect asks
> you to do.

That would all be easier if the kernel version didn't keep changing
and/or if grub defaulted to booting the last built kernel after
'make install'.
I may already have 'fixed' this system so it doesn't default
to booting the 'last booted' kernel - a real PITA when you are
trying to fix non-booting kernels.

Anyway, I completely failed to build a 'good' kernel.
Even 6.9-rc5 failed, and bisecting between 6.9-rc4 and 6.9-rc5
ended up building a 6.9-rc3+ kernel, and 'git diff v6.9-rc4'
was giving massive changes even though 'git bisect view' only
gave a few changes that couldn't be relevant.

I finally realised what the difference between the 'good' and 'bad'
kernels was.
It was all down to CONFIG_SPECULATION_MITIGATIONS being renamed to
CONFIG_CPU_MITIGATIONS and getting enabled 'by mistake'.
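
For anyone comparing configs the same way, a minimal sketch of a check
that shows how a given .config sets either name. The helper name
show_mitigations is hypothetical (it is not a kernel script), and the
default path of ./.config is an assumption:

```shell
#!/bin/sh
# Show how a kernel .config sets the old and the new mitigation option.
# show_mitigations is a hypothetical helper name, not part of the kernel tree.
show_mitigations() {
	config=${1:-.config}	# assumed default: .config in the current directory
	# Match both "CONFIG_X=y" and "# CONFIG_X is not set" lines.
	grep -E '^(# )?CONFIG_(SPECULATION|CPU)_MITIGATIONS[= ]' "$config"
}
```

Running it against the configs of the 'good' and 'bad' builds should
make a silently flipped option visible, e.g. 'show_mitigations .config'.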

If I build a 6.10-rc2 kernel without the mitigations I get
the 'fast' behaviour.
So there must actually be something quite subtle in the
timings.

So there is still a problem: if a cpu-intensive process
gets moved to a different cpu on a 'mostly idle' system then the
new cpu is likely to be running at a low frequency and will
take a while to speed up.
Move it often enough and it will run very slowly.
I suspect that something like (untested):

num_cpu=$(nproc)	# cpus available to this shell
cpu=0; while [ $cpu -lt $num_cpu ]; do
	# pin a loop that wakes up every 10ms to each cpu
	taskset --cpu-list $cpu sh -c 'while sleep 0.01; do :; done' &
	cpu=$((cpu + 1))
done

will cause a cpu-bound process to run very slowly.
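
To see whether the cpus really are sitting at a low frequency while
such a test runs, something like the following sketch could be used.
It assumes a cpufreq driver that exposes scaling_cur_freq (in kHz)
under /sys/devices/system/cpu; show_cpu_freqs is a hypothetical helper
name and the optional argument is only there so the sysfs root can be
overridden:

```shell
#!/bin/sh
# Print the current frequency (kHz) that cpufreq reports for each cpu.
# show_cpu_freqs is a hypothetical helper, not an existing tool.
show_cpu_freqs() {
	base=${1:-/sys/devices/system/cpu}
	for f in "$base"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
		[ -r "$f" ] || continue	# no cpufreq driver, or the glob did not match
		cpu=${f#"$base"/}	# strip the sysfs root
		cpu=${cpu%%/*}		# keep only the "cpuN" component
		printf '%s %s\n' "$cpu" "$(cat "$f")"
	done
}
```

Calling it in a loop ('while sleep 1; do show_cpu_freqs; done') next to
the cpu-bound process should show whether the cpu it lands on has had
time to ramp up.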

I think that ought to be considered a bug.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
