Message-ID: <aQtwbRrFBCUoQ2Yj@localhost.localdomain>
Date: Wed, 5 Nov 2025 16:42:37 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Waiman Long <llong@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Michal Koutný <mkoutny@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
Danilo Krummrich <dakr@...nel.org>,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Gabriele Monaco <gmonaco@...hat.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Ingo Molnar <mingo@...hat.com>, Jakub Kicinski <kuba@...nel.org>,
Jens Axboe <axboe@...nel.dk>, Johannes Weiner <hannes@...xchg.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Marco Crivellari <marco.crivellari@...e.com>,
Michal Hocko <mhocko@...e.com>, Muchun Song <muchun.song@...ux.dev>,
Paolo Abeni <pabeni@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Phil Auld <pauld@...hat.com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeel.butt@...ux.dev>,
Simon Horman <horms@...nel.org>, Tejun Heo <tj@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>, Will Deacon <will@...nel.org>,
cgroups@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-block@...r.kernel.org, linux-mm@...ck.org,
linux-pci@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 13/33] cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset
On Tue, Oct 21, 2025 at 12:10:16AM -0400, Waiman Long wrote:
> On 10/13/25 4:31 PM, Frederic Weisbecker wrote:
> > Until now, HK_TYPE_DOMAIN used to only include boot defined isolated
> > CPUs passed through isolcpus= boot option. Users interested in also
> > knowing the runtime defined isolated CPUs through cpuset must use
> > different APIs: cpuset_cpu_is_isolated(), cpu_is_isolated(), etc...
> >
> > There are many drawbacks to that approach:
> >
> > 1) Most interested subsystems want to know about all isolated CPUs, not
> > just those defined on boot time.
> >
> > 2) cpuset_cpu_is_isolated() / cpu_is_isolated() are not synchronized with
> > concurrent cpuset changes.
> >
> > 3) Further cpuset modifications are not propagated to subsystems
> >
> > Solve 1) and 2) and centralize all isolated CPUs within the
> > HK_TYPE_DOMAIN housekeeping cpumask.
> >
> > Subsystems can rely on RCU to synchronize against concurrent changes.
> >
> > The propagation mentioned in 3) will be handled in further patches.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > ---
> > include/linux/sched/isolation.h | 2 +
> > kernel/cgroup/cpuset.c | 2 +
> > kernel/sched/isolation.c | 75 ++++++++++++++++++++++++++++++---
> > kernel/sched/sched.h | 1 +
> > 4 files changed, 74 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
> > index da22b038942a..94d5c835121b 100644
> > --- a/include/linux/sched/isolation.h
> > +++ b/include/linux/sched/isolation.h
> > @@ -32,6 +32,7 @@ extern const struct cpumask *housekeeping_cpumask(enum hk_type type);
> > extern bool housekeeping_enabled(enum hk_type type);
> > extern void housekeeping_affine(struct task_struct *t, enum hk_type type);
> > extern bool housekeeping_test_cpu(int cpu, enum hk_type type);
> > +extern int housekeeping_update(struct cpumask *mask, enum hk_type type);
> > extern void __init housekeeping_init(void);
> > #else
> > @@ -59,6 +60,7 @@ static inline bool housekeeping_test_cpu(int cpu, enum hk_type type)
> > return true;
> > }
> > +static inline int housekeeping_update(struct cpumask *mask, enum hk_type type) { return 0; }
> > static inline void housekeeping_init(void) { }
> > #endif /* CONFIG_CPU_ISOLATION */
> > diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> > index aa1ac7bcf2ea..b04a4242f2fa 100644
> > --- a/kernel/cgroup/cpuset.c
> > +++ b/kernel/cgroup/cpuset.c
> > @@ -1403,6 +1403,8 @@ static void update_unbound_workqueue_cpumask(bool isolcpus_updated)
> > ret = workqueue_unbound_exclude_cpumask(isolated_cpus);
> > WARN_ON_ONCE(ret < 0);
> > + ret = housekeeping_update(isolated_cpus, HK_TYPE_DOMAIN);
> > + WARN_ON_ONCE(ret < 0);
> > }
> > /**
> > diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> > index b46c20b5437f..95d69c2102f6 100644
> > --- a/kernel/sched/isolation.c
> > +++ b/kernel/sched/isolation.c
> > @@ -29,18 +29,48 @@ static struct housekeeping housekeeping;
> > bool housekeeping_enabled(enum hk_type type)
> > {
> > - return !!(housekeeping.flags & BIT(type));
> > + return !!(READ_ONCE(housekeeping.flags) & BIT(type));
> > }
> > EXPORT_SYMBOL_GPL(housekeeping_enabled);
> > +static bool housekeeping_dereference_check(enum hk_type type)
> > +{
> > + if (IS_ENABLED(CONFIG_LOCKDEP) && type == HK_TYPE_DOMAIN) {
> > + /* Cpuset isn't even writable yet? */
> > + if (system_state <= SYSTEM_SCHEDULING)
> > + return true;
> > +
> > + /* CPU hotplug write locked, so cpuset partition can't be overwritten */
> > + if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_write_held())
> > + return true;
> > +
> > + /* Cpuset lock held, partitions not writable */
> > + if (IS_ENABLED(CONFIG_CPUSETS) && lockdep_is_cpuset_held())
> > + return true;
>
> I have some doubt about this condition as the cpuset_mutex may be held in
> the process of making changes to an isolated partition that will impact
> HK_TYPE_DOMAIN cpumask.
Indeed, and that is the point: if the current process is holding the cpuset
mutex, it is guaranteed that no other process will update the housekeeping
cpumask concurrently.
So the housekeeping mask is guaranteed to be stable, right? Of course
the current task may itself be changing it, but while it is writing the mask,
it is not reading it.
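To illustrate the argument, here is a minimal userspace sketch, not kernel code. It assumes a pthread mutex standing in for cpuset_mutex and a thread-local flag standing in for what lockdep_is_cpuset_held() reports. All model_* names and MODEL_ALL_CPUS are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: none of these names exist in the kernel. */
#define MODEL_ALL_CPUS 0xffUL

static pthread_mutex_t model_cpuset_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long model_hk_domain_mask = MODEL_ALL_CPUS;

/* Per-thread record of ownership, loosely mimicking lockdep's held-lock tracking. */
static _Thread_local bool model_lock_held;

static void model_cpuset_lock(void)
{
	pthread_mutex_lock(&model_cpuset_mutex);
	model_lock_held = true;
}

static void model_cpuset_unlock(void)
{
	model_lock_held = false;
	pthread_mutex_unlock(&model_cpuset_mutex);
}

/* True only if the *calling* thread holds the mutex. */
static bool model_cpuset_held(void)
{
	return model_lock_held;
}

/* Writer: every update of the mask happens under the mutex. */
static void model_update_mask(unsigned long isolated)
{
	assert(model_cpuset_held());
	model_hk_domain_mask = MODEL_ALL_CPUS & ~isolated;
}

/*
 * Reader: since all writers need the mutex, a reader that holds it sees a
 * stable mask. The only possible writer is the reader itself, and it is
 * not writing while it reads.
 */
static unsigned long model_read_mask(void)
{
	assert(model_cpuset_held());
	return model_hk_domain_mask;
}
```

A thread that takes model_cpuset_lock(), calls model_update_mask() and then model_read_mask() passes both assertions. The same read from a thread that does not hold the mutex trips the assertion, which is the kind of misuse the dereference check is meant to catch.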
Thanks.
>
> Cheers,
> Longman
>
--
Frederic Weisbecker
SUSE Labs