Message-ID: <zmjr43kk2m52huk2vvetvwefil7waletzuijiu5y34v3n4slgi@3wdtd3xckx7m>
Date: Mon, 12 Jan 2026 20:43:49 -0500
From: Aaron Tomlin <atomlin@...mlin.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>,
Shrikanth Hegde <sshegde@...ux.ibm.com>, neelx@...e.com, sean@...e.io, mproche@...il.com,
linux-kernel@...r.kernel.org, mingo@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com
Subject: Re: [RFC PATCH 0/1] sched/fair: Feature to suppress Fair Server for
NOHZ_FULL isolation
On Wed, Jan 07, 2026 at 11:26:59AM +0100, Peter Zijlstra wrote:
> We must not starve fair tasks -- this can severely affect the system
> health.
>
> Specifically per-cpu kthreads getting starved can cause complete system
> lockup when other CPUs go wait for completion and such.
>
> We must not disable the fair server, ever. Doing so means you get to
> keep the pieces.
>
> The only sane way is to ensure these tasks do not get queued in the
> first place.
Hi Peter,
To your point, in an effort to steer CFS (SCHED_NORMAL) tasks away from
isolated, RT-busy CPUs, I would be interested in your thoughts on the
following approach. By redirecting these "leaked" CFS tasks to housekeeping
CPUs before they are enqueued, rq->cfs.h_nr_queued remains zero on the
isolated core, which prevents the Fair Server from being started and keeps
the adaptive-tick CPU quiescent.
While a race window exists (an RT task could wake on the target CPU after
our check returns false), this is likely acceptable: if an RT task wakes
later, it will preempt the CFS task anyway, and the next time the CFS task
sleeps and wakes again, this logic should intercept and redirect it.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index da46c3164537..3db7a590a24d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8526,6 +8526,32 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	/* SD_flags and WF_flags share the first nibble */
 	int sd_flag = wake_flags & 0xF;
 
+	/*
+	 * When RT_SUPPRESS_FAIR_SERVER is enabled, we proactively steer CFS
+	 * tasks away from isolated CPUs that are currently executing
+	 * real-time tasks.
+	 *
+	 * Enqueueing a CFS task on such a CPU would trigger dl_server_start(),
+	 * which in turn restarts the tick to enforce bandwidth control. By
+	 * redirecting the task to a housekeeping CPU during the selection
+	 * phase, we preserve strict isolation and silence on the target CPU.
+	 */
+#if defined(CONFIG_NO_HZ_FULL)
+	if (sched_feat(RT_SUPPRESS_FAIR_SERVER) && !rt_bandwidth_enabled() &&
+	    housekeeping_enabled(HK_TYPE_KERNEL_NOISE)) {
+		struct rq *target_rq = cpu_rq(prev_cpu);
+		/*
+		 * Use READ_ONCE() to safely load the remote CPU's current
+		 * task pointer without holding the rq lock.
+		 */
+		struct task_struct *curr = READ_ONCE(target_rq->curr);
+
+		/* If the target CPU is isolated and busy with RT, redirect. */
+		if (rt_task(curr) &&
+		    !housekeeping_test_cpu(prev_cpu, HK_TYPE_KERNEL_NOISE))
+			return housekeeping_any_cpu(HK_TYPE_KERNEL_NOISE);
+	}
+#endif
+
 	/*
 	 * required for stable ->cpus_allowed
 	 */
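For completeness, this would presumably pair with a scheduler feature
declaration so the behaviour stays opt-in. A sketch of what I have in mind
(assuming the feature defaults to off and that kernel/sched/features.h is
the right home for it; hunk position elided):

```diff
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ ... @@
+/*
+ * Steer waking CFS tasks away from isolated CPUs busy with RT, so the
+ * fair server never needs to start there. Off by default.
+ */
+SCHED_FEAT(RT_SUPPRESS_FAIR_SERVER, false)
```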
--
Aaron Tomlin