lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260106153646.23280-2-sunlightlinux@gmail.com>
Date: Tue,  6 Jan 2026 17:36:46 +0200
From: "Ionut Nechita (Sunlight Linux)" <sunlightlinux@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>,
	Frederic Weisbecker <frederic@...nel.org>,
	Ingo Molnar <mingo@...nel.org>,
	Anna-Maria Behnsen <anna-maria@...utronix.de>,
	Ionut Nechita <ionut_n2001@...oo.com>
Cc: linux-kernel@...r.kernel.org
Subject: [PATCH 0/1] tick/nohz: Optimize tick stopping for isolated cores

From: Ionut Nechita <ionut_n2001@...oo.com>

This patch optimizes the tick stopping mechanism for nohz_full isolated
CPUs by introducing a fast-path that reduces timer interrupt overhead on
idle isolated cores.

Background:
-----------
CPU isolation with nohz_full is critical for latency-sensitive workloads
such as real-time applications, high-frequency trading, audio processing,
and gaming. The current implementation performs extensive dependency
checks even when the CPU is idle with no active dependencies, leading to
unnecessary overhead and delayed tick stopping decisions.

The Problem:
------------
When an isolated CPU becomes idle, the kernel checks multiple dependency
masks (global, per-CPU, task, and signal group) through function calls
that include tracing overhead. This checking process, while thorough,
introduces measurable latency that can cause:
1. Delayed tick stopping decisions
2. More frequent tick restarts
3. Higher interrupt overhead (LOC - Local timer interrupts)
4. Reduced effectiveness of CPU isolation

Implementation:
---------------
The patch adds two optimizations to can_stop_full_tick():

1. Prefetching: The dependency structures are prefetched into CPU cache
   before they are accessed, reducing memory latency for both the fast
   and slow paths.

2. Fast-path: For idle isolated CPUs with no dependencies, we perform
   simple atomic reads of the dependency masks. If all are zero, we
   immediately return true, skipping:
   - 4 function calls to check_tick_dependency()
   - Multiple branch predictions and tracing points
   - Additional atomic operations within those functions

Benchmark Results:
------------------
Testing was performed on systems with nohz_full configured CPUs running
idle workloads:

Before patch:
- Moderately isolated CPUs: ~8000 LOC interrupts
- Well-isolated CPUs:       ~500-1000 LOC interrupts

After patch:
- Moderately isolated CPUs: <500 LOC interrupts (94% reduction)
- Well-isolated CPUs:       122-125 LOC interrupts (75-88% reduction)

The improvement is most significant on CPUs that frequently transition
between idle and active states, which is common in real-time workloads.

Testing Methodology:
--------------------
Tests were conducted by:
1. Booting with nohz_full=<cpu_list> isolcpus=<cpu_list>
2. Running isolated workloads with periodic idle transitions
3. Monitoring /proc/interrupts LOC counter over 10-minute periods
4. Comparing interrupt counts with and without the patch
5. Testing across multiple CPU architectures and workload patterns

Impact:
-------
This optimization is transparent to existing code and maintains all
safety guarantees. The fast-path only triggers when all dependency
checks would pass anyway, so there is no functional change - only
improved performance.

The patch benefits any system using nohz_full CPU isolation, including:
- Real-time systems (PREEMPT_RT)
- Low-latency audio/video processing
- High-frequency trading applications
- Gaming systems with dedicated CPU cores
- Scientific computing with isolated calculation cores

Future Work:
------------
Additional optimizations could include:
- Per-CPU statistics to measure fast-path hit rate
- Architecture-specific prefetch optimizations
- Extended fast-path for non-idle but single-task scenarios

Ionut Nechita (1):
  tick/nohz: Add fast-path tick stopping for idle isolated cores

 kernel/time/tick-sched.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

--
2.52.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ