Message-ID: <46cb50d65e414bfd9bef5549d68ae4ea@AcuMS.aculab.com>
Date: Sat, 8 Jun 2024 14:44:12 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Linus Torvalds' <torvalds@...ux-foundation.org>, "Linux Kernel Mailing
 List" <linux-kernel@...r.kernel.org>
Subject: RE: Linux 6.10-rc2 - massive performance regression

I'm seeing a massive performance regression in 6.10-rc2 compared to 6.9-rc4.
I suspect the change is in 6.10-rc1.

Some code that is (ought to be) mostly userspace is taking about 2.5 times
longer to run (30 minutes instead of 12).

The program (an FPGA VHDL compiler) should be running almost entirely in
userspace.

I suspect it is due to the changes to the way the scheduler pre-empts
processes.

What makes all the difference is an idle daemon that is pretty much
doing:
Thread 1:
	for (;;) {
		poll(0, 0, 10);
		pthread_cond_broadcast(cv); // [1]
	}
Threads 2+ (total one for each cpu):
	for (;;) {
		pthread_cond_wait(cv, mtx);
	}
ie all the threads wake up every 10ms, find there is nothing
to do and go back to sleep.
(They would be processing TDM or RTP audio.)
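
For anyone wanting to reproduce this, the pattern above can be sketched as
a bounded, compilable function. This is only a rough approximation, not the
real daemon: the names (struct idle_daemon, worker, run_idle_daemon), the
generation counter used as the wait predicate, and the stop flag are all
assumptions added so the sketch terminates and copes with spurious wakeups.

```c
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state; every field is protected by mtx. */
struct idle_daemon {
	pthread_mutex_t mtx;
	pthread_cond_t cv;
	unsigned long tick;	/* bumped by the ticker every 10ms */
	int stop;		/* set once to make the workers exit */
	unsigned long wakeups;	/* ticks observed, summed over all workers */
};

/* Threads 2+: wake on each tick, find nothing to do, sleep again. */
static void *worker(void *arg)
{
	struct idle_daemon *d = arg;
	unsigned long seen = 0;

	pthread_mutex_lock(&d->mtx);
	for (;;) {
		/* Re-check the predicate to cope with spurious wakeups. */
		while (d->tick == seen && !d->stop)
			pthread_cond_wait(&d->cv, &d->mtx);
		if (d->stop)
			break;
		seen = d->tick;
		d->wakeups++;
	}
	pthread_mutex_unlock(&d->mtx);
	return NULL;
}

/*
 * Thread 1: the calling thread acts as the 10ms ticker.
 * Returns the total number of worker wakeups, or -1 on error.
 */
long run_idle_daemon(int nworkers, int ticks)
{
	struct idle_daemon d;
	pthread_t *tids;
	int i, started = 0;

	if (nworkers <= 0 || ticks < 0)
		return -1;
	tids = calloc(nworkers, sizeof(*tids));
	if (!tids)
		return -1;

	pthread_mutex_init(&d.mtx, NULL);
	pthread_cond_init(&d.cv, NULL);
	d.tick = 0;
	d.stop = 0;
	d.wakeups = 0;

	for (i = 0; i < nworkers; i++) {
		if (pthread_create(&tids[i], NULL, worker, &d) != 0)
			break;
		started++;
	}

	for (i = 0; i < ticks; i++) {
		poll(NULL, 0, 10);	/* sleep for 10ms */
		pthread_mutex_lock(&d.mtx);
		d.tick++;
		pthread_cond_broadcast(&d.cv);
		pthread_mutex_unlock(&d.mtx);
	}

	/* Tell the workers to exit, then collect them. */
	pthread_mutex_lock(&d.mtx);
	d.stop = 1;
	pthread_cond_broadcast(&d.cv);
	pthread_mutex_unlock(&d.mtx);
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	pthread_cond_destroy(&d.cv);
	pthread_mutex_destroy(&d.mtx);
	free(tids);
	return started == nworkers ? (long)d.wakeups : -1;
}
```

To approximate the setup described here, call it from main() with one
worker per cpu (for example sysconf(_SC_NPROCESSORS_ONLN) from <unistd.h>)
and a large tick count, build with -pthread, and time a cpu-bound job
alongside it on both kernels.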

My suspicion is that threads 2+ are looping in the futex() call
until a scheduler timer tick instead of actually sleeping and
letting anything else run.

For this configuration all the daemon threads are running under the
default scheduler, but it is usually better to run them at a low RT priority.
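
As an illustration of what "low RT priority" could look like, here is a
minimal sketch. The helper name set_low_rt_priority is hypothetical, and
the choice of SCHED_FIFO at the minimum priority is an assumption; the
daemon may well use a different policy or priority.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/*
 * Move a thread to the lowest SCHED_FIFO priority.
 * Returns 0 on success or an errno value (typically EPERM) on failure.
 */
int set_low_rt_priority(pthread_t thread)
{
	struct sched_param sp;
	int rc;

	/* Assumption: the lowest real-time priority is "low" enough. */
	sp.sched_priority = sched_get_priority_min(SCHED_FIFO);
	rc = pthread_setschedparam(thread, SCHED_FIFO, &sp);
	if (rc == EPERM)
		fprintf(stderr, "RT scheduling needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO\n");
	return rc;
}
```

Each daemon thread could call set_low_rt_priority(pthread_self()) before
entering its loop. The regression reported here was seen without this, with
all the threads under the default scheduler.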

I can't remember whether there is a sysctl to change the scheduler behaviour,
but the current default is badly broken for some workloads.

	David

[1] It isn't that simple: the broadcast only wakes one thread, which then
wakes the next (etc). Not only do the delays getting each cpu out of its sleep
state accumulate, but if any thread can't run (eg if they are RT and the
cpu is busy) none of the later ones wake up.
So, in fact, there is a separate cv for each thread and the woken threads
also issue wakeups.
But that is all a different problem...

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
