[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZfshDg9Vyfl6hqVG@yury-ThinkPad>
Date: Wed, 20 Mar 2024 10:46:54 -0700
From: Yury Norov <yury.norov@...il.com>
To: Rasmus Villemoes <linux@...musvillemoes.dk>
Cc: Dawei Li <dawei.li@...ngroup.cn>, andriy.shevchenko@...ux.intel.com,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cpumask: Create dedicated kmem cache for cpumask var
On Wed, Mar 20, 2024 at 10:03:02AM +0100, Rasmus Villemoes wrote:
> On 19/03/2024 13.24, Dawei Li wrote:
> > alloc_cpumask_var_node() and friends allocate cpumask var dynamically
> > for CONFIG_CPUMASK_OFFSTACK=y kernel. The allocated size of cpumask var
> > is cpumask_size(), which is runtime constant after nr_cpu_ids is
> > freezed.
> >
> > Create a dedicated kmem cache for dynamic allocation of cpumask var.
>
> Why?
Hi Dawei,
Agree with Rasmus. CPUMASK_OFFSTACK=y is quite a rare configuration,
normally disabled. Can you show the architecture that you're working
with and how the cache you're adding affects performance.
> > The window for creation of cache is somewhat narrow:
> > - After last update of nr_cpu_ids(via set_nr_cpu_ids())
> > - After kmem cache is available.
> > - Before any alloc_cpumask_var_node() invocations(sched_init() e.g).
Not only narrow but also not uniform across platforms. For example,
on XEN xen_smp_count_cpus() may adjust nr_cpu_ids. I don't think that
people runn XEN with CPUMASK_OFFSTACK=y, but you have to make sure
that your cache is always created before the 1st allocation.
> OK, so this sounds somewhat fragile. It's maybe correct, but I fail to
> see what is gained by this, and the commit message does not provide any
> hints.
Agree. To make it less vulnerable, you have to enforce something like:
bool cpumask_cache_used = false;
static inline void set_nr_cpu_ids(unsigned int nr)
{
if (WARN_ON(cpumask_cache_used))
return;
nr_cpu_ids = nr;
cpumask_cache_destroy();
cpumask_cache_init()
}
bool alloc_cpumask_var_node()
{
cpumask_cache_used = true;
*mask = kmalloc_node(cpumask_size(), flags, node);
...
}
But at the very first, we need to understand under which scenarios the
new cache would benefit performance?
Thnaks,
Yury
benefits performance
Powered by blists - more mailing lists