Message-ID: <a8cdeb85-2629-440d-9c11-69f6e19f8cb6@redhat.com>
Date: Sun, 31 Aug 2025 20:40:36 -0400
From: Waiman Long <llong@...hat.com>
To: Frederic Weisbecker <frederic@...nel.org>,
LKML <linux-kernel@...r.kernel.org>
Cc: Michal Koutný <mkoutny@...e.com>,
Ingo Molnar <mingo@...hat.com>, Johannes Weiner <hannes@...xchg.org>,
Marco Crivellari <marco.crivellari@...e.com>, Michal Hocko
<mhocko@...e.com>, Peter Zijlstra <peterz@...radead.org>,
Tejun Heo <tj@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>, cgroups@...r.kernel.org
Subject: Re: [PATCH 14/33] cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset
On 8/29/25 11:47 AM, Frederic Weisbecker wrote:
> Until now, HK_TYPE_DOMAIN used to only include boot defined isolated
> CPUs passed through isolcpus= boot option. Users interested in also
> knowing the runtime defined isolated CPUs through cpuset must use
> different APIs: cpuset_cpu_is_isolated(), cpu_is_isolated(), etc...
>
> There are many drawbacks to that approach:
>
> 1) Most interested subsystems want to know about all isolated CPUs, not
> just those defined on boot time.
>
> 2) cpuset_cpu_is_isolated() / cpu_is_isolated() are not synchronized with
> concurrent cpuset changes.
>
> 3) Further cpuset modifications are not propagated to subsystems
>
> Solve 1) and 2) and centralize all isolated CPUs within the
> HK_TYPE_DOMAIN housekeeping cpumask.
>
> Subsystems can rely on RCU to synchronize against concurrent changes.
>
> The propagation mentioned in 3) will be handled in further patches.
>
> Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> ---
> include/linux/sched/isolation.h | 4 +-
> kernel/cgroup/cpuset.c | 2 +
> kernel/sched/isolation.c | 65 ++++++++++++++++++++++++++++++---
> kernel/sched/sched.h | 1 +
> 4 files changed, 65 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 5ddb8dc5ca91..48f3b6b20604 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -23,16 +23,39 @@ EXPORT_SYMBOL_GPL(housekeeping_flags);
>
> bool housekeeping_enabled(enum hk_type type)
> {
> - return !!(housekeeping_flags & BIT(type));
> + return !!(READ_ONCE(housekeeping_flags) & BIT(type));
> }
> EXPORT_SYMBOL_GPL(housekeeping_enabled);
>
> +static bool housekeeping_dereference_check(enum hk_type type)
> +{
> + if (type == HK_TYPE_DOMAIN) {
> + if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_write_held())
> + return true;
> + if (IS_ENABLED(CONFIG_CPUSETS) && lockdep_is_cpuset_held())
> + return true;
> +
> + return false;
> + }
> +
> + return true;
> +}

Both lockdep_is_cpuset_held() and lockdep_is_cpus_write_held() may be
defined only if CONFIG_LOCKDEP is set. However, this function is
currently referenced by __housekeeping_cpumask() via RCU_LOCKDEP_WARN(),
so it is not invoked if CONFIG_LOCKDEP is not set. You are relying on an
unreferenced static function not being compiled into the object file.
Should we bracket it with "#ifdef CONFIG_LOCKDEP" just to make this
clear?

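To illustrate the bracketing, here is a minimal userspace sketch, not a
tested kernel change. The two lockdep helpers are stubs, HK_DEREF_CHECK()
and hk_check_model() are hypothetical names that stand in for the
RCU_LOCKDEP_WARN() call site, HK_TYPE_TIMER is listed only as an example
of a second type, and the IS_ENABLED() tests are left out for brevity:

```c
#include <stdbool.h>

enum hk_type { HK_TYPE_DOMAIN, HK_TYPE_TIMER };

#ifdef CONFIG_LOCKDEP
/* Stubs for the real lockdep helpers; they always report "not held". */
static bool lockdep_is_cpus_write_held(void) { return false; }
static bool lockdep_is_cpuset_held(void) { return false; }

/* Only built when the lockdep helpers it calls are available. */
static bool housekeeping_dereference_check(enum hk_type type)
{
	if (type == HK_TYPE_DOMAIN) {
		if (lockdep_is_cpus_write_held())
			return true;
		if (lockdep_is_cpuset_held())
			return true;

		return false;
	}

	return true;
}
#define HK_DEREF_CHECK(type) housekeeping_dereference_check(type)
#else
/* Without lockdep the check collapses to a constant. */
#define HK_DEREF_CHECK(type) true
#endif

/* Hypothetical caller standing in for __housekeeping_cpumask(). */
int hk_check_model(enum hk_type type)
{
	return HK_DEREF_CHECK(type) ? 1 : 0;
}
```

Built without -DCONFIG_LOCKDEP, neither the helper nor the stubs are
compiled at all, which is the property the #ifdef is meant to make
explicit.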
> +
> +static inline struct cpumask *__housekeeping_cpumask(enum hk_type type)
> +{
> + return rcu_dereference_check(housekeeping_cpumasks[type],
> + housekeeping_dereference_check(type));
> +}
> +
> const struct cpumask *housekeeping_cpumask(enum hk_type type)
> {
> - if (housekeeping_flags & BIT(type)) {
> - return rcu_dereference_check(housekeeping_cpumasks[type], 1);
> - }
> - return cpu_possible_mask;
> + const struct cpumask *mask = NULL;
> +
> + if (READ_ONCE(housekeeping_flags) & BIT(type))
> + mask = __housekeeping_cpumask(type);
> + if (!mask)
> + mask = cpu_possible_mask;
> + return mask;
> }
> EXPORT_SYMBOL_GPL(housekeeping_cpumask);
>
> @@ -70,12 +93,42 @@ EXPORT_SYMBOL_GPL(housekeeping_affine);
>
> bool housekeeping_test_cpu(int cpu, enum hk_type type)
> {
> - if (housekeeping_flags & BIT(type))
> + if (READ_ONCE(housekeeping_flags) & BIT(type))
> return cpumask_test_cpu(cpu, housekeeping_cpumask(type));
> return true;
> }
> EXPORT_SYMBOL_GPL(housekeeping_test_cpu);
>
> +int housekeeping_update(struct cpumask *mask, enum hk_type type)
> +{
> + struct cpumask *trial, *old = NULL;
> +
> + if (type != HK_TYPE_DOMAIN)
> + return -ENOTSUPP;
> +
> + trial = kmalloc(sizeof(*trial), GFP_KERNEL);
> + if (!trial)
> + return -ENOMEM;
> +
> + cpumask_andnot(trial, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT), mask);
> + if (!cpumask_intersects(trial, cpu_online_mask)) {
> + kfree(trial);
> + return -EINVAL;
> + }
> +
> + if (housekeeping_flags & BIT(type))
> + old = __housekeeping_cpumask(type);
> + else
> + WRITE_ONCE(housekeeping_flags, housekeeping_flags | BIT(type));

Should we use READ_ONCE() to retrieve the current housekeeping_flags
value here, as the other readers in this patch do?
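A rough userspace sketch of the pattern I mean, assuming GCC/Clang
__typeof__. The READ_ONCE()/WRITE_ONCE()/BIT() macros here are simplified
stand-ins for the kernel ones, and hk_flag_test_and_enable() is a
hypothetical helper, not something in the patch:

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel macros. */
#define BIT(nr) (1UL << (nr))
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

enum hk_type { HK_TYPE_DOMAIN, HK_TYPE_TIMER };

static unsigned long housekeeping_flags;

/*
 * Hypothetical helper: take one snapshot of the flags, test the bit on
 * that snapshot and, if it is clear, publish the updated value.
 * Returns true if the type was already enabled.
 */
bool hk_flag_test_and_enable(enum hk_type type)
{
	unsigned long flags = READ_ONCE(housekeeping_flags);

	if (flags & BIT(type))
		return true;

	WRITE_ONCE(housekeeping_flags, flags | BIT(type));
	return false;
}
```

Reading the flags once into a local also means the test and the value
passed to WRITE_ONCE() are based on the same snapshot. This does not make
the update atomic against a concurrent writer; it only pairs the marked
write with a marked read.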
Cheers,
Longman