Message-ID: <20260122080937.22347-2-sunlightlinux@gmail.com>
Date: Thu, 22 Jan 2026 10:09:37 +0200
From: "Ionut Nechita (Sunlight Linux)" <sunlightlinux@...il.com>
To: rafael@...nel.org
Cc: daniel.lezcano@...aro.org,
christian.loehle@....com,
linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org,
yumpusamongus@...il.com,
Ionut Nechita <ionut_n2001@...oo.com>
Subject: [PATCH v2 0/1] cpuidle: menu: Fix high wakeup latency on modern platforms
From: Ionut Nechita <ionut_n2001@...oo.com>
Hi,
This v2 patch addresses high wakeup latency in the menu cpuidle governor
on modern platforms with high C-state exit latencies.
Changes in v2:
==============
Based on Christian Loehle's feedback, I've simplified the approach to use
min(predicted_ns, data->next_timer_ns) instead of the 25% safety margin
from v1.
The simpler approach is cleaner and achieves the same goal: preventing the
governor from selecting excessively deep C-states when the prediction
suggests a short idle period but next_timer_ns is large (e.g., 10ms).
I will test both approaches (simple min vs 25% margin) and provide
detailed comparison data including:
- C-state residency tables
- Usage statistics
- Idle miss counts (above/below)
- Actual latency measurements
Thank you Christian for the valuable feedback and for pointing out that
the simpler approach may be sufficient.
Background:
===========
On Intel server platforms from 2022 onwards (Sapphire Rapids, Granite
Rapids), we observe excessive wakeup latencies (~150us) in network-
sensitive workloads when using the menu governor with NOHZ_FULL enabled.
The issue stems from the governor using next_timer_ns directly when the
tick is already stopped and predicted_ns < TICK_NSEC. This causes
selection of very deep package C-states (PC6) even when the prediction
suggests a much shorter idle duration.
On platforms with high C-state exit latencies (Intel SPR: 190us for C6,
or systems with large C-state gaps like C2 36us → C3 700us with 350us
exit latency), this results in significant wakeup penalties.
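To make the penalty concrete, here is a minimal sketch of how an inflated
idle estimate turns into a deep state choice. It is a simplified model, not
the governor's actual selection loop (which also honours the latency
constraint). The struct, the function and the table are hypothetical; the
36us and 700us figures are assumed here to be target residencies, the 350us
exit latency is taken from the text above, and the C2 exit latency is an
invented placeholder.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified idle state descriptor for this sketch. */
struct sketch_state {
	const char *name;
	uint64_t target_residency_ns;
	uint64_t exit_latency_ns;
};

/* Illustrative table, ordered from shallow to deep. */
const struct sketch_state sketch_table[] = {
	{ "C2",  36000ULL,  18000ULL }, /* exit latency is a placeholder */
	{ "C3", 700000ULL, 350000ULL }, /* 350us exit latency from the text */
};

/*
 * Pick the deepest state whose target residency still fits into the
 * expected idle duration. State 0 is the fallback.
 */
size_t sketch_select(const struct sketch_state *states, size_t n,
		     uint64_t duration_ns)
{
	size_t idx = 0;

	for (size_t i = 1; i < n; i++) {
		if (states[i].target_residency_ns > duration_ns)
			break;
		idx = i;
	}
	return idx;
}
```

Under these assumptions, sketch_select(sketch_table, 2, 10000000) picks C3,
so a wakeup arriving after 50us pays the 350us exit latency, whereas
sketch_select(sketch_table, 2, 50000) stays in C2.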
Testing:
========
Initial testing on Sapphire Rapids shows a roughly 5x reduction in wakeup
latency (151us → ~30us). I will provide comprehensive test results comparing
the baseline, the simple min() approach, and the 25% margin approach.
Ionut Nechita (1):
cpuidle: menu: Use min() to prevent deep C-states when tick is stopped
drivers/cpuidle/governors/menu.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
--
2.52.0