Message-ID: <aLri71mWB52kklkF@localhost.localdomain>
Date: Fri, 5 Sep 2025 15:17:35 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Valentin Schneider <vschneid@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>,
Gabriele Monaco <gmonaco@...hat.com>,
Ingo Molnar <mingo@...nel.org>, Jonathan Corbet <corbet@....net>,
Marcelo Tosatti <mtosatti@...hat.com>,
Marco Crivellari <marco.crivellari@...e.com>,
Michal Hocko <mhocko@...nel.org>,
"Paul E . McKenney" <paulmck@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, Phil Auld <pauld@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>, Waiman Long <longman@...hat.com>,
linux-doc@...r.kernel.org
Subject: Re: [PATCH] doc: Add CPU Isolation documentation
On Mon, Aug 11, 2025 at 06:35:26PM +0200, Valentin Schneider wrote:
> On 09/08/25 11:42, Frederic Weisbecker wrote:
> > nohz_full was introduced in v3.10 in 2013, which means this
> > documentation is overdue for 12 years.
> >
>
> 12 years is not that bad, it's not old enough to drink (legally) yet!
;-)
>
> > The shoemaker's children always go barefoot. And working on timers
> > hasn't made me arrive on time either.
> >
> > Fortunately Paul wrote a part of the needed documentation a while ago,
> > especially concerning nohz_full in Documentation/timers/no_hz.rst and
> > also about per-CPU kthreads in
> > Documentation/admin-guide/kernel-per-CPU-kthreads.rst
> >
> > Introduce a new page that gives an overview of CPU isolation in general.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > ---
> > Documentation/admin-guide/cpu-isolation.rst | 338 ++++++++++++++++++++
> > Documentation/admin-guide/index.rst | 1 +
> > 2 files changed, 339 insertions(+)
> > create mode 100644 Documentation/admin-guide/cpu-isolation.rst
> >
> > diff --git a/Documentation/admin-guide/cpu-isolation.rst b/Documentation/admin-guide/cpu-isolation.rst
> > new file mode 100644
> > index 000000000000..250027acf7b2
> > --- /dev/null
> > +++ b/Documentation/admin-guide/cpu-isolation.rst
> > @@ -0,0 +1,338 @@
> > +=============
> > +CPU Isolation
> > +=============
> > +
> > +Introduction
> > +============
> > +
> > +"CPU Isolation" means leaving a CPU exclusive to a given userspace
> ^^^^^^^^^
> Eh I'm being nitpicky, but this doesn't have to be userspace stuff right?
> "someone" could e.g. affine some IRQ to an isolated CPU to have the
> irqthread run undisturbed there, or somesuch.
Good point!
> > +
> > +Scheduler domain isolation
> > +--------------------------
> > +
> > +This feature isolates a CPU from the scheduler topology. As a result,
> > +the target isn't part of the load balancing. Tasks won't migrate
> > +neither from nor to it unless affine explicitly.
> ^^^^^^
> s/affine/affined/
Right.
>
> > +As a side effect the CPU is also isolated from unbound workqueues and
> > +unbound kthreads.
>
> > +Checklist
> > +=========
> > +
> > +You have set up each of the above isolation features but you still
> > +observe jitters that trash your workload? Make sure to check a few
> > +elements before proceeding.
> > +
> > +Some of these checklist items are similar to those of real time
> > +workloads:
> > +
> > +- Use mlock() to prevent your pages from being swapped away. Page
> > + faults are usually not compatible with jitter sensitive workloads.
> > +
> > +- Avoid SMT to prevent your hardware thread from being "preempted"
> > + by another one.
> > +
> > +- CPU frequency changes may induce subtle sorts of jitter in a
> > + workload. Cpufreq should be used and tuned with caution.
> > +
> > +- Deep C-states may result in latency issues upon wake-up. If this
> > + happens to be a problem, C-states can be limited via kernel boot
> > + parameters such as processor.max_cstate or intel_idle.max_cstate.
> > +
>
> Nitpickery again, I know it's not an exhaustive listing, but I'd rather
> point to the sysfs cpuidle interface (or just mention it too), since that
> means deep C-states can be left enabled for HK CPUs.
Yes!
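
For illustration only, a minimal sketch of the per-CPU approach through the
sysfs cpuidle interface. It assumes the usual layout
/sys/devices/system/cpu/cpuN/cpuidle/stateM/{latency,disable}; the function
name and the latency threshold are hypothetical, not something the patch
defines:

```python
import os

def disable_deep_cstates(cpus, max_latency_us,
                         sysfs_root="/sys/devices/system/cpu"):
    """Disable idle states whose exit latency exceeds max_latency_us,
    but only on the given CPUs. Housekeeping CPUs are left untouched.

    Returns the list of (cpu, state) pairs that were disabled.
    """
    disabled = []
    for cpu in cpus:
        cpuidle_dir = os.path.join(sysfs_root, f"cpu{cpu}", "cpuidle")
        if not os.path.isdir(cpuidle_dir):
            continue  # no cpuidle driver exposed for this CPU
        for state in sorted(os.listdir(cpuidle_dir)):
            state_dir = os.path.join(cpuidle_dir, state)
            latency_file = os.path.join(state_dir, "latency")
            if not state.startswith("state") or not os.path.isfile(latency_file):
                continue
            with open(latency_file) as f:
                latency_us = int(f.read().strip())
            if latency_us > max_latency_us:
                # Writing 1 to "disable" turns this state off for this CPU only.
                with open(os.path.join(state_dir, "disable"), "w") as f:
                    f.write("1")
                disabled.append((cpu, state))
    return disabled
```

Run as root, something like disable_deep_cstates([2, 3], 10) would restrict
only the isolated CPUs 2 and 3, whereas the boot parameters apply to every
CPU.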
>
>
> Should we also mention BIOS/firmware fuckery like SMIs?
>
> """
> - Your system may be subject to firmware-originating interrupts - x86 has
> System Management Interrupts (SMIs) for example. Check your system BIOS
> to disable such interference, and with some luck your vendor will have
> a BIOS tuning guidance for low-latency operations.
> """
Definitely!
>
> > +Debugging
> > +=========
> > +
> > +Of course things are never so easy, especially on this matter.
> > +Chances are that actual noise will be observed in the aforementioned
> > +trace.7 file.
> > +
> > +The best way to investigate further is to enable finer grained
> > +tracepoints such as those of subsystems producing asynchronous
> > +events: workqueue, timer, irq_vector, etc... It also can be
> > +interesting to enable the tick_stop event to diagnose why the tick is
> > +retained when that happens.
> > +
>
> I'd also list the 'ipi_send*' family, although that's emitted from the HK
> CPU, not the disturbed isolated CPU.
Yeah I can do that.
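
As a rough sketch of what enabling those events could look like from a
script: the helper below writes to the per-event "enable" files under
tracefs. The mount point /sys/kernel/tracing, the helper name and the
default event list are assumptions for illustration; "irq_vectors" is the
x86 name of that event group and "ipi" is where the ipi_send* events live
on kernels that have them.

```python
import os

# Event groups discussed above. Availability depends on the architecture
# and kernel version, so missing entries are reported rather than fatal.
DEFAULT_EVENTS = ("workqueue", "timer", "irq_vectors", "ipi")

def enable_events(events=DEFAULT_EVENTS, tracefs_root="/sys/kernel/tracing"):
    """Enable trace events or whole event groups.

    Each entry is either a group ("timer") or a single event
    ("timer/tick_stop"). Returns (enabled, missing).
    """
    enabled, missing = [], []
    for event in events:
        enable_file = os.path.join(tracefs_root, "events", event, "enable")
        if os.path.isfile(enable_file):
            with open(enable_file, "w") as f:
                f.write("1")
            enabled.append(event)
        else:
            missing.append(event)
    return enabled, missing
```

Passing ["timer/tick_stop"] instead of the default list enables only the
tick_stop event, which is handy when the only question is why the tick was
retained.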
>
> > +Some tools may also be useful for higher level analysis:
> > +
> > +- :ref:`Documentation/tools/rtla/rtla-osnoise.rst <rtla-osnoise>` runs a kernel
> > + tracer that analyzes and outputs a
> > + summary of the noises.
> > +
>
> I'd want to point to hwnoise and timerlat as well, so maybe point to
> rtla.rst?
Good point.
Thanks!
>
> > +- dynticks-testing does something similar but in userspace. It is available
> > + at git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git
> > diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> > index 259d79fbeb94..b5f1fc7d5290 100644
> > --- a/Documentation/admin-guide/index.rst
> > +++ b/Documentation/admin-guide/index.rst
> > @@ -94,6 +94,7 @@ likely to be of interest on almost any system.
> >
> > cgroup-v2
> > cgroup-v1/index
> > + cpu-isolation
> > cpu-load
> > mm/index
> > module-signing
> > --
> > 2.50.1
>
--
Frederic Weisbecker
SUSE Labs