Message-ID: <002101dbb349$03e2c7a0$0ba856e0$@telus.net>
Date: Mon, 21 Apr 2025 22:40:10 -0700
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Alexander Egorenkov'" <egorenar@...ux.ibm.com>,
<tip-bot2@...utronix.de>
Cc: <linux-kernel@...r.kernel.org>,
<linux-tip-commits@...r.kernel.org>,
<mingo@...nel.org>,
<peterz@...radead.org>,
<x86@...nel.org>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [tip: sched/urgent] sched/fair: Fix EEVDF entity placement bug causing scheduling lag
On 2025.04.17 02:57 Alexander Egorenkov wrote:
> Hi Peter,
>
> after this change, we are seeing big latencies when trying to execute a
> simple command per SSH on a Fedora 41 s390x remote system which is under
> stress.
>
> I was able to bisect the problem to this commit.
>
> The problem is easy to reproduce with stress-ng executed on the remote
> system which is otherwise unoccupied and concurrent SSH connect attempts
> from a local system to the remote one.
>
> stress-ng (on remote system)
> ----------------------------
>
> $ cpus=$(nproc)
> $ stress-ng --cpu $((cpus * 2)) --matrix 50 --mq 50 --aggressive --brk 2
> --stack 2 --bigheap 2 --userfaultfd 0 --perf -t 5m
That is a very stressful test. It crashes within a few seconds on my test computer,
with a "Segmentation fault (core dumped)" message.
If I back it off to this:
$ stress-ng --cpu 24 --matrix 50 --mq 50 --aggressive --brk 2 --stack 2 --bigheap 2 -t 300m
It runs, but still generates a great many entries in /var/log/kern.log as the OOM killer runs, etc.
I suggest it is not a reasonable test workload.
Anyway, I used turbostat the same way I was using it back in January for this work, and did observe
longer-than-requested intervals.
Of 1427 samples, 10 had an interval more than 1 second longer than requested.
The worst was 7.5 seconds longer than requested.
I rechecked the 100% workload used in January (12X "yes > /dev/null") and it was fine:
3551 samples, and the actual interval was never more than 10 milliseconds longer than requested.
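For anyone wanting to tally interval overshoot the same way, here is a minimal sketch. It is not the script used for the numbers above. It assumes you already have one timestamp per turbostat sample, in seconds; the function name and the sample values are hypothetical and purely illustrative.

```python
def interval_overshoot(timestamps, requested, threshold):
    """Summarize how far actual sample intervals exceed the requested one.

    timestamps: sample times in seconds, in ascending order (assumed input)
    requested:  requested sample interval in seconds
    threshold:  overshoot in seconds above which an interval counts as late

    Returns (number of intervals, number of late intervals, worst overshoot).
    """
    # Overshoot of each interval = actual gap between samples - requested gap.
    overshoots = [
        (later - earlier) - requested
        for earlier, later in zip(timestamps, timestamps[1:])
    ]
    late = sum(1 for o in overshoots if o > threshold)
    worst = max(overshoots, default=0.0)
    return len(overshoots), late, worst


# Made-up timestamps for illustration only, not measured data.
samples = [0.0, 2.0, 4.0, 9.5, 11.5]
print(interval_overshoot(samples, requested=2.0, threshold=1.0))
```

With a 2 second requested interval and a 1 second threshold, this would count the intervals that ran more than 1 second long and report the worst case.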
Kernel 6.15-rc2
Turbostat version: 2025.04.06
Turbostat sample interval: 2 seconds.
Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz (12 CPU, 6 cores)
... Doug