Message-ID: <20250715155810.514141-1-longman@redhat.com>
Date: Tue, 15 Jul 2025 11:58:10 -0400
From: Waiman Long <longman@...hat.com>
To: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>
Cc: linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org,
Chen Ridong <chenridong@...weicloud.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Koutný <mkoutny@...e.com>,
Waiman Long <longman@...hat.com>
Subject: [PATCH] sched/core: Mask out offline CPUs when user_cpus_ptr is used

Chen Ridong reported that cpuset can trigger a kernel warning for a task
when set_cpus_allowed_ptr() fails in the following corner case:

 1) the task used sched_setaffinity(2) to set its CPU affinity mask to
    be the same as the cpuset.cpus of its cpuset,
 2) all the CPUs assigned to that cpuset were taken offline, and
 3) cpuset v1 is in use and the task had to be migrated to the top cpuset.

Because the CPU affinity of tasks in the top cpuset is not updated when
a CPU hotplug online/offline event happens, offline CPUs remain in the
CPU affinity of those tasks. Further masking with the user_cpus_ptr set
by sched_setaffinity(2) in __set_cpus_allowed_ptr() can therefore leave
only offline CPUs in the new mask, causing the subsequent call to
__set_cpus_allowed_ptr_locked() to fail because the mask contains no
usable CPU.

Fix this failure by also masking out offline CPUs whenever user_cpus_ptr
masking is done, and fall back to ignoring user_cpus_ptr if the
resulting cpumask is empty.

Reported-by: Chen Ridong <chenridong@...weicloud.com>
Closes: https://lore.kernel.org/lkml/20250714032311.3570157-1-chenridong@huaweicloud.com/
Fixes: da019032819a ("sched: Enforce user requested affinity")
Signed-off-by: Waiman Long <longman@...hat.com>
---
kernel/sched/core.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 81c6df746df1..4cf25dd8827f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3172,10 +3172,15 @@ int __set_cpus_allowed_ptr(struct task_struct *p, struct affinity_context *ctx)
/*
* Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
* flags are set.
+ *
+ * Even though the given new_mask must have at least one online CPU,
+ * masking with user_cpus_ptr may strip out all online CPUs causing
+ * failure. So offline CPUs have to be masked out too.
*/
if (p->user_cpus_ptr &&
!(ctx->flags & (SCA_USER | SCA_MIGRATE_ENABLE | SCA_MIGRATE_DISABLE)) &&
- cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
+ cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr) &&
+ cpumask_and(rq->scratch_mask, rq->scratch_mask, cpu_active_mask))
ctx->new_mask = rq->scratch_mask;
return __set_cpus_allowed_ptr_locked(p, ctx, rq, &rf);
--
2.50.0