Message-ID: <20250714115915.GU905792@noisy.programming.kicks-ass.net>
Date: Mon, 14 Jul 2025 13:59:15 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Chen Ridong <chenridong@...weicloud.com>
Cc: longman@...hat.com, tj@...nel.org, hannes@...xchg.org, mkoutny@...e.com,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
lujialin4@...wei.com, chenridong@...wei.com
Subject: Re: [PATCH next] cpuset: fix warning when attaching tasks with
 offline CPUs

On Mon, Jul 14, 2025 at 07:30:39PM +0800, Chen Ridong wrote:
> >> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> >> index f74d04429a29..5401adbdffa6 100644
> >> --- a/kernel/cgroup/cpuset.c
> >> +++ b/kernel/cgroup/cpuset.c
> >> @@ -3121,7 +3121,7 @@ static void cpuset_attach_task(struct cpuset *cs, struct task_struct *task)
> >>  	if (cs != &top_cpuset)
> >>  		guarantee_active_cpus(task, cpus_attach);
> >>  	else
> >> -		cpumask_andnot(cpus_attach, task_cpu_possible_mask(task),
> >> +		cpumask_andnot(cpus_attach, cpu_active_mask,
> >>  			       subpartitions_cpus);
> >
> > This breaks things. Any task mask must be a subset of
> > task_cpu_possible_mask() at all times. It might not be able to run
> > outside of that mask.
>
> Hi Peter,
>
> Thanks for your feedback. I'm afraid I don't fully understand what you
> mean by "breaks things". Could you please explain in more detail?
>
> To clarify my current understanding: this patch simply changes the
> cpus_attach initialization from task_cpu_possible_mask(task) to
> cpu_active_mask. The intention is that when CPUs are offlined and
> tasks get migrated to root cpuset, we shouldn't try to migrate tasks
> to offline CPUs. And since cpu_active_mask is a subset of
> cpu_possible_mask, I thought this would be safe. Did I miss anything?

task_cpu_possible_mask() is the mask a task *MUST* stay inside of.
Specifically, this was introduced for ARMv9 where some CPUs drop the
capability to run ARM32 instructions. Trying to schedule an ARM32 task
on a CPU that does not support that instruction set is an immediate and
fatal fail.

Your change results in something akin to:

  set_cpus_allowed_task(task, cpu_active_mask & ~subpartition_cpus);

Which does not honor the task_cpu_possible_mask() constraint.
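
The gap is that cpu_active_mask being a subset of cpu_possible_mask says
nothing about the per-task mask, which can be narrower. A toy userspace
model may make this concrete. It is only a sketch: mask_t, the helper
names and the example CPU numbering are made up for illustration, none
of it is kernel code, and intersecting with the task's possible mask is
just one conceivable way to keep the constraint, not a fix proposed in
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical): one bit per CPU, bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/* True if every CPU in 'sub' is also in 'super'. */
static bool mask_subset(mask_t sub, mask_t super)
{
	return (sub & ~super) == 0;
}

/* What the patched line computes: active CPUs minus subpartition CPUs. */
static mask_t attach_mask_patched(mask_t active, mask_t subparts)
{
	return active & ~subparts;
}

/*
 * One conceivable alternative (an assumption, not the thread's fix):
 * start from the task's possible mask, then drop inactive and
 * subpartition CPUs.
 */
static mask_t attach_mask_constrained(mask_t task_possible, mask_t active,
				      mask_t subparts)
{
	return task_possible & active & ~subparts;
}

/* Example: CPUs 0-3 are active, but the task can only run on CPUs 0-1. */
void example(void)
{
	mask_t active = 0xF, task_possible = 0x3, subparts = 0x0;

	/* The patched mask includes CPUs 2-3, which the task cannot run on. */
	assert(!mask_subset(attach_mask_patched(active, subparts),
			    task_possible));

	/* The constrained mask stays inside the task's possible mask. */
	assert(mask_subset(attach_mask_constrained(task_possible, active,
						   subparts),
			   task_possible));
}
```

In this model the patched mask is 0xF while the task may only use 0x3,
so the subset check fails; intersecting with the task's mask first keeps
the result at 0x3.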