Message-ID: <aV4r3O_Xr5Q3qwvI@jlelli-thinkpadt14gen4.remote.csb>
Date: Wed, 7 Jan 2026 10:48:12 +0100
From: Juri Lelli <juri.lelli@...hat.com>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: Shrikanth Hegde <sshegde@...ux.ibm.com>, neelx@...e.com, sean@...e.io,
mproche@...il.com, linux-kernel@...r.kernel.org, mingo@...hat.com,
peterz@...radead.org, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com
Subject: Re: [RFC PATCH 0/1] sched/fair: Feature to suppress Fair Server for
NOHZ_FULL isolation

Hello!

On 06/01/26 09:49, Aaron Tomlin wrote:
> On Tue, Jan 06, 2026 at 02:37:49PM +0530, Shrikanth Hegde wrote:
> > If all your SCHED_FIFO tasks are pinned and their scheduling decisions
> > are managed in userspace, using isolcpus would offer you better
> > isolation compared to nohz_full.
>
> Hi Shrikanth,
>
> You are entirely correct; isolcpus=domain (or isolcpus= without flags as
> per housekeeping_isolcpus_setup()) indeed offers superior isolation by
> removing the CPU from the scheduler load-balancing domains.
>
> I must apologise for the omission in my previous correspondence. I
> neglected to mention that our specific configuration utilises isolcpus= in
> conjunction with nohz_full=.
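>
> (For illustration only - the CPU range here is made up - that means booting
> with something along the lines of "isolcpus=domain,2-5 nohz_full=2-5".)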
>
> > > However, the extant "Fair Server" (Deadline Server) architecture
> > > compromises this isolation guarantee. At present, should a background
> > > SCHED_OTHER task be enqueued, the scheduler initiates the Fair Server
> > > (dl_server_start). As the Fair Server functions as a SCHED_DEADLINE entity,
> > > its activation increments rq->dl.dl_nr_running.
> > >
> >
> > There is runtime allocated to the fair server. If you set it to 0 on the
> > CPUs of interest, wouldn't that work?
> >
> > /sys/kernel/debug/sched/fair_server/<cpu>/runtime
>
> Yes, you are quite right; setting the fair server runtime to 0 (via
> /sys/kernel/debug/sched/fair_server/[cpu]/runtime) does indeed achieve the
> desired effect. In my testing, the SCHED_FIFO task on the fully
> adaptive-tick CPU remains uninterrupted by the restored clock-tick when
> this configuration is applied. Thank you.
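>
> For reference, this is roughly what we now run at boot for the isolated
> CPUs (a minimal sketch only; the CPU list below is made up for the
> example, and the per-CPU debugfs directories are assumed to be named
> "cpu<N>" as on our systems):
>
> /*
>  * Write 0 to the per-CPU fair server runtime so dl_server_start() has
>  * no bandwidth to enforce on the isolated CPUs. The CPU list is
>  * hypothetical.
>  */
> #include <stdio.h>
>
> int main(void)
> {
> 	const int isolated_cpus[] = { 2, 3, 4, 5 };
> 	char path[128];
> 	unsigned int i;
>
> 	for (i = 0; i < sizeof(isolated_cpus) / sizeof(isolated_cpus[0]); i++) {
> 		FILE *f;
>
> 		snprintf(path, sizeof(path),
> 			 "/sys/kernel/debug/sched/fair_server/cpu%d/runtime",
> 			 isolated_cpus[i]);
> 		f = fopen(path, "w");
> 		if (!f) {
> 			perror(path);
> 			continue;
> 		}
> 		fprintf(f, "0\n");
> 		fclose(f);
> 	}
>
> 	return 0;
> }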
>
> However, I believe it would be beneficial if the kernel could detect this
> situation and suppress the fair server automatically. While the manual
> runtime adjustment works, having the kernel detect the condition - an RT
> task is running and bandwidth enforcement is disabled - would provide a
> more seamless and robust solution for partitioned systems, without
> requiring external intervention.
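>
> Very roughly, the sort of condition I have in mind looks like the below
> (only a sketch to illustrate the idea; the helper name is made up and
> this is not the actual patch):
>
> /*
>  * Sketch: skip starting the fair server on a CPU where an RT task is
>  * runnable and RT bandwidth enforcement has been disabled
>  * (sched_rt_runtime_us == -1), i.e. the partitioned case above.
>  */
> static inline bool fair_server_wanted(struct rq *rq)
> {
> 	/* RT throttling off: SCHED_FIFO may legitimately run unbounded */
> 	if (sysctl_sched_rt_runtime < 0 && rq->rt.rt_nr_running)
> 		return false;
>
> 	return true;
> }
>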
> I may consider an improved version of the patch that includes a "Fair
> server disabled" warning much like in sched_fair_server_write().

I am not sure we need/want the automatic mechanism either, as we already
have the fair_server interface. I tend to think that if any CFS task
(kthreads included) gets enqueued on an "isolated" CPU, the problem likely
resides in sub-optimal isolation - usually a configuration issue, or a
kernel issue that needs solving (e.g. a for_each_cpu loop that needs
changing). Starving such tasks might anyway end in a system crash of
sorts.
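
For instance (purely illustrative, not a real call site), the kind of
loop I mean above is one that spreads per-CPU work across every online
CPU when it should really be confined to the housekeeping CPUs; the
per-CPU work item below is made up for the example:

	int cpu;

	for_each_online_cpu(cpu) {
		/* skip CPUs isolated via isolcpus= (domain isolation) */
		if (!housekeeping_cpu(cpu, HK_TYPE_DOMAIN))
			continue;
		/* my_work: a hypothetical DEFINE_PER_CPU(struct work_struct, ...) */
		queue_work_on(cpu, system_wq, &per_cpu(my_work, cpu));
	}
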
Thanks,
Juri