Open Source and information security mailing list archives
 
Message-ID: <002101dbb349$03e2c7a0$0ba856e0$@telus.net>
Date: Mon, 21 Apr 2025 22:40:10 -0700
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Alexander Egorenkov'" <egorenar@...ux.ibm.com>,
	<tip-bot2@...utronix.de>
Cc: <linux-kernel@...r.kernel.org>,
	<linux-tip-commits@...r.kernel.org>,
	<mingo@...nel.org>,
	<peterz@...radead.org>,
	<x86@...nel.org>,
	"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [tip: sched/urgent] sched/fair: Fix EEVDF entity placement bug causing scheduling lag

On 2025.04.17 02:57 Alexander Egorenkov wrote:

> Hi Peter,
>
> after this change, we are seeing big latencies when trying to execute a
> simple command per SSH on a Fedora 41 s390x remote system which is under
> stress.
>
> I was able to bisect the problem to this commit.
>
> The problem is easy to reproduce with stress-ng executed on the remote
> system which is otherwise unoccupied and concurrent SSH connect attempts
> from a local system to the remote one.
>
> stress-ng (on remote system)
> ----------------------------
>
> $ cpus=$(nproc)
> $ stress-ng --cpu $((cpus * 2)) --matrix 50 --mq 50 --aggressive --brk 2
>            --stack 2 --bigheap 2 --userfaultfd 0 --perf -t 5m

That is an extremely stressful test. It crashes within a few seconds on my test computer
with a "Segmentation fault (core dumped)" message.
If I back it off to this:

$ stress-ng --cpu 24 --matrix 50 --mq 50 --aggressive --brk 2 --stack 2 --bigheap 2 -t 300m

It runs, but still generates a great many entries in /var/log/kern.log as the OOM killer runs, etc.
I suggest it is not a reasonable test workload.

Anyway, I used turbostat the same way I was using it back in January for this work, and did observe
longer-than-requested intervals.
I took 1427 samples and got 10 where the interval time was more than 1 second longer than requested.
The worst was 7.5 seconds longer than requested.
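(For anyone wanting to post-process this kind of data: a minimal sketch of how the late-sample count above could be tallied. It assumes a hypothetical file, intervals.txt, with one measured interval duration in seconds per line; the filename and threshold are my own choices, not anything turbostat produces.)

```shell
# Count samples whose measured interval ran more than 1 second over
# the requested 2-second turbostat interval.
requested=2
awk -v req="$requested" '$1 > req + 1 { late++ } END { print late + 0 }' intervals.txt
```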

I rechecked the 100% workload used in January (12X "yes > /dev/null") and it was fine.
3551 samples and the actual interval was never more than 10 milliseconds longer than requested.

Kernel 6.15-rc2
Turbostat version: 2025.04.06
Turbostat sample interval: 2 seconds.
Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz (12 CPU, 6 cores)

... Doug


