Message-ID: <aLri71mWB52kklkF@localhost.localdomain>
Date: Fri, 5 Sep 2025 15:17:35 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Valentin Schneider <vschneid@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
	Anna-Maria Behnsen <anna-maria@...utronix.de>,
	Gabriele Monaco <gmonaco@...hat.com>,
	Ingo Molnar <mingo@...nel.org>, Jonathan Corbet <corbet@....net>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Marco Crivellari <marco.crivellari@...e.com>,
	Michal Hocko <mhocko@...nel.org>,
	"Paul E . McKenney" <paulmck@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>, Phil Auld <pauld@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Vlastimil Babka <vbabka@...e.cz>, Waiman Long <longman@...hat.com>,
	linux-doc@...r.kernel.org
Subject: Re: [PATCH] doc: Add CPU Isolation documentation

On Mon, Aug 11, 2025 at 06:35:26PM +0200, Valentin Schneider wrote:
> On 09/08/25 11:42, Frederic Weisbecker wrote:
> > nohz_full was introduced in v3.10 in 2013, which means this
> > documentation is overdue for 12 years.
> >
> 
> 12 years is not that bad, it's not old enough to drink (legally) yet!

;-)

> 
> > The shoemaker's children always go barefoot. And working on timers
> > hasn't made me arriving on time either.
> >
> > Fortunately Paul wrote a part of the needed documentation a while ago,
> > especially concerning nohz_full in Documentation/timers/no_hz.rst and
> > also about per-CPU kthreads in
> > Documentation/admin-guide/kernel-per-CPU-kthreads.rst
> >
> > Introduce a new page that gives an overview of CPU isolation in general.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > ---
> >  Documentation/admin-guide/cpu-isolation.rst | 338 ++++++++++++++++++++
> >  Documentation/admin-guide/index.rst         |   1 +
> >  2 files changed, 339 insertions(+)
> >  create mode 100644 Documentation/admin-guide/cpu-isolation.rst
> >
> > diff --git a/Documentation/admin-guide/cpu-isolation.rst b/Documentation/admin-guide/cpu-isolation.rst
> > new file mode 100644
> > index 000000000000..250027acf7b2
> > --- /dev/null
> > +++ b/Documentation/admin-guide/cpu-isolation.rst
> > @@ -0,0 +1,338 @@
> > +=============
> > +CPU Isolation
> > +=============
> > +
> > +Introduction
> > +============
> > +
> > +"CPU Isolation" means leaving a CPU exclusive to a given userspace
>                                                             ^^^^^^^^^
> Eh I'm being nitpicky, but this doesn't have to be userspace stuff right?
> "someone" could e.g. affine some IRQ to an isolated CPU to have the
> irqthread run undisturbed there, or somesuch.

Good point!

> > +
> > +Scheduler domain isolation
> > +--------------------------
> > +
> > +This feature isolates a CPU from the scheduler topology. As a result,
> > +the target isn't part of the load balancing. Tasks won't migrate
> > +neither from nor to it unless affine explicitly.
>                                  ^^^^^^
> s/affine/affined/

Right.

> 
> > +As a side effect the CPU is also isolated from unbound workqueues and
> > +unbound kthreads.
> 
> > +Checklist
> > +=========
> > +
> > +You have set up each of the above isolation features but you still
> > +observe jitters that trash your workload? Make sure to check a few
> > +elements before proceeding.
> > +
> > +Some of these checklist items are similar to those of real time
> > +workloads:
> > +
> > +- Use mlock() to prevent your pages from being swapped away. Page
> > +  faults are usually not compatible with jitter sensitive workloads.
> > +
> > +- Avoid SMT to prevent your hardware thread from being "preempted"
> > +  by another one.
> > +
> > +- CPU frequency changes may induce subtle sorts of jitter in a
> > +  workload. Cpufreq should be used and tuned with caution.
> > +
> > +- Deep C-states may result in latency issues upon wake-up. If this
> > +  happens to be a problem, C-states can be limited via kernel boot
> > +  parameters such as processor.max_cstate or intel_idle.max_cstate.
> > +
> 
> Nitpickery again, I know it's not an exhaustive listing, but I'd rather
> point to the sysfs cpuidle interface (or just mention it too), since that
> means deep C-states can be left enabled for HK CPUs.

Yes!
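
For illustration, here is a minimal sketch of the per-CPU sysfs cpuidle approach suggested above. It assumes the standard layout under /sys/devices/system/cpu/cpuN/cpuidle/stateM/ with its `latency` (exit latency in microseconds), `name` and `disable` attributes. The helper name, the latency threshold and the CPU numbers are hypothetical. Writing to `disable` needs root and does not persist across reboots.

```python
from pathlib import Path


def disable_deep_cstates(cpus, max_latency_us,
                         sysfs_root="/sys/devices/system/cpu"):
    """Disable idle states deeper than max_latency_us on the given CPUs.

    Returns a list of (cpu, state_name) pairs that were disabled.
    Housekeeping CPUs that are not listed keep all their idle states.
    """
    disabled = []
    for cpu in cpus:
        cpuidle = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
        # Sort numerically so that state10 comes after state2.
        states = sorted(cpuidle.glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
        for state in states:
            exit_latency_us = int((state / "latency").read_text())
            if exit_latency_us > max_latency_us:
                (state / "disable").write_text("1")
                name = (state / "name").read_text().strip()
                disabled.append((cpu, name))
    return disabled


# Hypothetical usage, assuming CPUs 6 and 7 are the isolated ones:
#   disable_deep_cstates([6, 7], max_latency_us=10)
```

Unlike a boot parameter such as processor.max_cstate, this only touches the CPUs passed in, which is what lets housekeeping CPUs keep their deep C-states.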

> 
> 
> Should we also mention BIOS/firmware fuckery like SMIs?
> 
> """
> - Your system may be subject to firmware-originating interrupts - x86 has
>   System Management Interrupts (SMIs) for example. Check your system BIOS
>   to disable such interference, and with some luck your vendor will have
>   a BIOS tuning guidance for low-latency operations.
> """

Definitely!
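
As a rough way to check whether SMIs are firing at all, the sketch below samples a per-CPU SMI counter. It rests on several assumptions that are not part of the patch: an Intel CPU exposing MSR_SMI_COUNT at address 0x34, the `msr` kernel module being loaded so that /dev/cpu/N/msr exists, and root privileges. The function name is hypothetical.

```python
import os
import struct

MSR_SMI_COUNT = 0x34  # Intel-specific; assumed to be present on the target CPU


def read_smi_count(cpu, dev_root="/dev/cpu"):
    """Return the SMI count seen by `cpu`, read through the msr driver."""
    fd = os.open(os.path.join(dev_root, str(cpu), "msr"), os.O_RDONLY)
    try:
        # The msr device uses the file offset as the MSR address and
        # returns the 64-bit register value in little-endian byte order.
        raw = os.pread(fd, 8, MSR_SMI_COUNT)
    finally:
        os.close(fd)
    (value,) = struct.unpack("<Q", raw)
    return value & 0xFFFFFFFF  # the counter lives in the low 32 bits


# Hypothetical usage: sample before and after a run on isolated CPU 7
# and compare the two values.
#   before = read_smi_count(7)
#   ... run the workload ...
#   after = read_smi_count(7)
```

A counter that increases during the run points at firmware interference rather than at the kernel isolation setup.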

> 
> > +Debugging
> > +=========
> > +
> > +Of course things are never so easy, especially on this matter.
> > +Chances are that actual noise will be observed in the aforementioned
> > +trace.7 file.
> > +
> > +The best way to investigate further is to enable finer grained
> > +tracepoints such as those of subsystems producing asynchronous
> > +events: workqueue, timer, irq_vector, etc... It also can be
> > +interesting to enable the tick_stop event to diagnose why the tick is
> > +retained when that happens.
> > +
> 
> I'd also list the 'ipi_send*' family, although that's emitted from the HK
> CPU, not the disturbed isolated CPU.

Yeah I can do that.
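
A minimal sketch of switching on the events discussed above through tracefs follows. It assumes tracefs is mounted at /sys/kernel/tracing and that the event directories are named `workqueue`, `timer` (which contains tick_stop), `irq_vectors` (the x86 name) and `ipi/ipi_send_cpu` / `ipi/ipi_send_cpumask`. The ipi_send_* events only exist on recent kernels, so missing entries are reported instead of treated as errors. The helper name and the event list are hypothetical.

```python
from pathlib import Path

# Assumed event names; adjust to what events/ actually contains.
NOISE_EVENTS = [
    "workqueue",
    "timer",                 # includes timer/tick_stop
    "irq_vectors",           # x86 only
    "ipi/ipi_send_cpu",      # emitted on the sending (housekeeping) CPU
    "ipi/ipi_send_cpumask",
]


def enable_events(events, tracefs="/sys/kernel/tracing"):
    """Write 1 to each event's enable file.

    Returns (enabled, missing): the events that were switched on and
    the ones that do not exist on this kernel.
    """
    enabled, missing = [], []
    for event in events:
        enable_file = Path(tracefs) / "events" / event / "enable"
        if enable_file.exists():
            enable_file.write_text("1")
            enabled.append(event)
        else:
            missing.append(event)
    return enabled, missing


# Hypothetical usage (as root):
#   enabled, missing = enable_events(NOISE_EVENTS)
```

Since the ipi_send_* events fire on the housekeeping CPU, the trace has to be read for all CPUs, not only for the isolated one.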

> 
> > +Some tools may also be useful for higher level analysis:
> > +
> > +- :ref:`Documentation/tools/rtla/rtla-osnoise.rst <rtla-osnoise>` runs a kernel
> > +  tracer that analyzes and output a
> > +  summary of the noises.
> > +
> 
> I'd want to point to hwnoise and timerlat as well, so maybe point to
> rtla.rst?

Good point.

Thanks!

> 
> > +- dynticks-testing does something similar but in userspace. It is available
> > +  at git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git
> > diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> > index 259d79fbeb94..b5f1fc7d5290 100644
> > --- a/Documentation/admin-guide/index.rst
> > +++ b/Documentation/admin-guide/index.rst
> > @@ -94,6 +94,7 @@ likely to be of interest on almost any system.
> >
> >     cgroup-v2
> >     cgroup-v1/index
> > +   cpu-isolation
> >     cpu-load
> >     mm/index
> >     module-signing
> > --
> > 2.50.1
> 

-- 
Frederic Weisbecker
SUSE Labs
