Message-Id: <20240120025053.684838-1-yury.norov@gmail.com>
Date: Fri, 19 Jan 2024 18:50:44 -0800
From: Yury Norov <yury.norov@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ming Lei <ming.lei@...hat.com>,
linux-kernel@...r.kernel.org
Cc: Yury Norov <yury.norov@...il.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Breno Leitao <leitao@...ian.org>,
Nathan Chancellor <nathan@...nel.org>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Zi Yan <ziy@...dia.com>
Subject: [PATCH v5 0/9] lib/group_cpus: rework grp_spread_init_one() and make it O(1)
The grp_spread_init_one() implementation is sub-optimal because it
traverses bitmaps from the beginning on every iteration instead of
resuming from where the previous iteration stopped.

Fix it and use the find_bit API where appropriate. While here, optimize
cpumask allocation and drop an unneeded cpumask_empty() call.
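
To illustrate the difference between rescanning and resuming, here is a
minimal userspace sketch. It is not the kernel code: next_bit(),
pick_rescan() and pick_resume() are hypothetical names, the bitmap is a
single 64-bit word, and __builtin_ctzll() assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

#define NBITS 64u

/*
 * Hypothetical userspace stand-in for a find_next_bit()-style helper:
 * returns the index of the first set bit at or after 'start', or NBITS
 * if there is none.
 */
static unsigned int next_bit(uint64_t map, unsigned int start)
{
	if (start >= NBITS)
		return NBITS;
	map &= ~UINT64_C(0) << start;	/* drop the bits below 'start' */
	return map ? (unsigned int)__builtin_ctzll(map) : NBITS;
}

/*
 * Rescanning: every pick searches from bit 0 again, so the bit that was
 * taken has to be cleared to keep it from being returned twice.
 */
static unsigned int pick_rescan(uint64_t *map, unsigned int *out,
				unsigned int n)
{
	unsigned int taken = 0;

	assert(map && out);
	while (taken < n) {
		unsigned int bit = next_bit(*map, 0);

		if (bit >= NBITS)
			break;
		*map &= ~(UINT64_C(1) << bit);
		out[taken++] = bit;
	}
	return taken;
}

/*
 * Resuming: each search continues right after the previously found bit,
 * so positions that were already visited are not walked again and the
 * map itself does not need to be modified.
 */
static unsigned int pick_resume(uint64_t map, unsigned int *out,
				unsigned int n)
{
	unsigned int taken = 0, bit;

	assert(out);
	for (bit = next_bit(map, 0); bit < NBITS && taken < n;
	     bit = next_bit(map, bit + 1))
		out[taken++] = bit;
	return taken;
}
```

Both helpers return the same bits in the same order. With a one-word map
the cost is identical; the saving shows up only with multi-word bitmaps,
where a rescan has to step over the already consumed words on each pick.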
---
v1: https://lore.kernel.org/all/ZW5MI3rKQueLM0Bz@yury-ThinkPad/T/
v2: https://lore.kernel.org/lkml/ZXKNVRu3AfvjaFhK@fedora/T/
v3: https://lore.kernel.org/lkml/20231212042108.682072-7-yury.norov@gmail.com/T/
v4: https://lore.kernel.org/lkml/20231228200936.2475595-1-yury.norov@gmail.com/T/
v5: add CPUMASK_NULL macro and use it to initialize cpumask_var_t
variables properly.
On the cpumask_var_t initialization issue:

Having different types behind the same typedef has been considered
nasty for quite a while. See the comment in include/linux/cpumask.h,
for example.

Now that I'm trying to adapt the kernel cleanup machinery to cpumasks, it
reveals another disadvantage of this approach: there's no way to
initialize a cpumask_var_t variable at declaration time, which the
cleanup implementation requires.
To fix that, in v5 I added a CPUMASK_NULL macro as a workaround.
CPUMASK_NULL would also be useful for those converting existing code
to use cleanup variables.
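
For readers unfamiliar with the problem, the following userspace model
may help. It is only a sketch under stated assumptions: mask_var_t,
MASK_NULL, MODEL_OFFSTACK and the helpers are hypothetical names, and
MASK_NULL is not the definition of CPUMASK_NULL from this series. It
relies on the GCC/Clang cleanup attribute, which the kernel cleanup
machinery is built on.

```c
#include <assert.h>
#include <stdlib.h>

/* Set to 0 to model the on-stack (array) flavour of the typedef. */
#ifndef MODEL_OFFSTACK
#define MODEL_OFFSTACK 1
#endif

struct mask { unsigned long bits[4]; };

#if MODEL_OFFSTACK
typedef struct mask *mask_var_t;	/* pointer to a heap allocation */
#define MASK_NULL NULL
#else
typedef struct mask mask_var_t[1];	/* array living on the stack */
#define MASK_NULL { { { 0 } } }
#endif

static int nr_cleanups;	/* demo only: counts cleanup invocations */

/* Runs automatically when a variable marked with it leaves scope. */
static void mask_cleanup(mask_var_t *m)
{
	assert(m);	/* the compiler passes the variable's address */
#if MODEL_OFFSTACK
	free(*m);	/* safe only if *m was initialized; free(NULL) is a no-op */
#endif
	nr_cleanups++;
}

/* Returns 1 on success, 0 on allocation failure. */
static int mask_alloc(mask_var_t *m)
{
#if MODEL_OFFSTACK
	*m = calloc(1, sizeof(**m));
	return *m != NULL;
#else
	(void)m;	/* nothing to allocate for the array flavour */
	return 1;
#endif
}

static int use_mask(int fail_early)
{
	/* One initializer spelling has to work for both typedef flavours. */
	mask_var_t m __attribute__((cleanup(mask_cleanup))) = MASK_NULL;

	if (fail_early)
		return -1;	/* cleanup still runs, on a well-defined value */
	if (!mask_alloc(&m))
		return -1;

	m->bits[0] = 1;
	return (int)m->bits[0];
}
```

Without the initializer, the early return in the pointer flavour would
hand an indeterminate pointer to free(). A plain "= NULL" does not
compile for the array flavour, which is why a single macro that expands
differently per configuration is needed.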
In the long term, it's better to drop CPUMASK_OFFSTACK entirely. Moreover,
it's used only on Power and x86 machines if NR_CPUS >= 8K (unless people
enable it explicitly, and nobody bothers doing that in real life). But
that requires some more discussion with the Power and x86 people...

Meanwhile, I'm going to submit a patchset that deprecates cpumask_var_t
and adds a new set of allocators that support initialization at
declaration time.

Yury Norov (9):
  cpumask: introduce for_each_cpu_and_from()
  lib/group_cpus: optimize inner loop in grp_spread_init_one()
  lib/group_cpus: relax atomicity requirement in grp_spread_init_one()
  lib/group_cpus: optimize outer loop in grp_spread_init_one()
  lib/group_cpus: don't zero cpumasks in group_cpus_evenly() on
    allocation
  lib/group_cpus: drop unneeded cpumask_empty() call in
    __group_cpus_evenly()
  cpumask: define cleanup function for cpumasks
  lib/group_cpus: rework group_cpus_evenly()
  lib/group_cpus: simplify group_cpus_evenly() for more

 include/linux/cpumask.h |  16 ++++++
 include/linux/find.h    |   3 ++
 lib/group_cpus.c        | 110 ++++++++++++++++------------------------
 3 files changed, 62 insertions(+), 67 deletions(-)

--
2.40.1