Message-ID: <035c23b4-118e-6a35-36d9-1b11e3d679f8@gmail.com>
Date: Fri, 15 Oct 2021 00:43:45 -0700
From: Norbert <nbrtt01@...il.com>
To: linux-kernel@...r.kernel.org
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Performance regression: thread wakeup time (latency) increased up to 3x

Performance regression: thread wakeup time (latency) increased up to 3x.
Happened between 5.13.8 and 5.14.0. Still happening at least on 5.14.11.
Short story:
------------
On isolated CPUs, wakeup increased from 1080 ns to 3550 ns. (3.3x)
On non-isol. CPUs, wakeup increased from 980 ns to 1245 ns. (1.3x)
Such an increase is surely not an expected side effect of an intentional
change, especially considering that threads on isolated CPUs are often
latency sensitive. It also, for example, significantly reduces throughput
on contended locks in general (1.3x).
Long story:
-----------
Time measured from before futex-wake on thread A, to after futex-wait
returns on thread B.
Times are similar for eventfd write -> blocked-read, just a bit higher.
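
Roughly, the measurement looks like this (a simplified, single-iteration
sketch of the method, not the exact benchmark code; the real runs loop
many times and collect a histogram):

#define _GNU_SOURCE
#include <linux/futex.h>     /* FUTEX_WAIT_PRIVATE, FUTEX_WAKE_PRIVATE */
#include <sys/syscall.h>     /* SYS_futex */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>          /* syscall, sleep */

static atomic_uint futex_word;  /* 0 = waiter sleeps, 1 = wake up */
static struct timespec t_wake;  /* written by thread A just before the wake */

static long futex(atomic_uint *uaddr, int op, unsigned int val)
{
        return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Thread B: block in futex-wait, timestamp as soon as it returns. */
static void *waiter(void *arg)
{
        struct timespec t_run;

        (void)arg;
        while (atomic_load(&futex_word) == 0)
                futex(&futex_word, FUTEX_WAIT_PRIVATE, 0);

        clock_gettime(CLOCK_MONOTONIC, &t_run);
        printf("wakeup latency: %ld ns\n",
               (t_run.tv_sec - t_wake.tv_sec) * 1000000000L +
               (t_run.tv_nsec - t_wake.tv_nsec));
        return NULL;
}

/* Thread A: timestamp, then futex-wake the waiter. */
int main(void)
{
        pthread_t b;

        pthread_create(&b, NULL, waiter, NULL);
        sleep(1);                       /* crude: let thread B block first */

        clock_gettime(CLOCK_MONOTONIC, &t_wake);
        atomic_store(&futex_word, 1);
        futex(&futex_word, FUTEX_WAKE_PRIVATE, 1);

        pthread_join(b, NULL);
        return 0;
}
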
Threads A and B have their affinity set to two neighboring CPUs of a
Threadripper Zen 2 CPU at a fixed frequency of 4.0 GHz. On isolated CPUs
the threads run with SCHED_FIFO, on non-isolated CPUs with SCHED_OTHER;
however, that does not make a big difference (I also measured the other
combinations).
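
The pinning and scheduling setup is roughly the following (the CPU
numbers used below, 8 and 9, are placeholders, not the actual CPUs from
the runs):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

static void pin_and_set_policy(pthread_t t, int cpu, int policy, int prio)
{
        cpu_set_t set;
        struct sched_param sp = { .sched_priority = prio };

        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        pthread_setaffinity_np(t, sizeof(set), &set);
        pthread_setschedparam(t, policy, &sp);
}

/*
 * Isolated case (CPUs on the isolcpus= boot parameter list):
 *         pin_and_set_policy(thread_a, 8, SCHED_FIFO, 1);
 *         pin_and_set_policy(thread_b, 9, SCHED_FIFO, 1);
 * Non-isolated case:
 *         pin_and_set_policy(thread_a, 8, SCHED_OTHER, 0);
 *         pin_and_set_policy(thread_b, 9, SCHED_OTHER, 0);
 */
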
Measured 5.13.0, 5.13.8, 5.14.0, 5.14.9 and 5.14.11.
Some on Fedora 35 Beta, some on ClearLinux 35100.
All given times are measured with multi-user.target (no GUI shell).
Times on graphical.target (with GUI shell) are about 10% higher.
These values are not an average over a mix of much shorter and much
longer times; the distribution is narrow. A typical distribution
(no samples below 3300 ns, none above 5099 ns):
3300-3399 ns:   858
3400-3499 ns: 19359
3500-3599 ns: 57257
3600-3699 ns:  6135
3700-3799 ns:   150
3800-3899 ns:    48
3900-3999 ns:    11
4000-4099 ns:    10
4100-4199 ns:    10
4200-4299 ns:    10
4300-4399 ns:     7
4400-4499 ns:    11
4500-4599 ns:     3
4600-4699 ns:     6
4700-4799 ns:     3
4800-4899 ns:     4
4900-4999 ns:     1
5000-5099 ns:     3
Also the times for the futex-wake call itself increased significantly:
On isolated CPUs, wake call increased from 510 ns to 710 ns. (1.4x)
On non-isol. CPUs, wake call increased from 420 ns to 580 ns. (1.4x)
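
The wake-call time is measured with a timestamp on each side of the
syscall, roughly like this (reusing futex() and futex_word from the
sketch above; again simplified, not the exact benchmark code):

static long time_wake_call(void)
{
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        atomic_store(&futex_word, 1);
        futex(&futex_word, FUTEX_WAKE_PRIVATE, 1);  /* the timed call */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        return (t1.tv_sec - t0.tv_sec) * 1000000000L +
               (t1.tv_nsec - t0.tv_nsec);
}
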
This is my first time reporting a kernel problem, so please excuse me if
this is not the right place or form. (Also, I don't yet have the know-how
to bisect arbitrary kernel versions or to compile specific patches.)