Message-ID: <87ms5kxayd.ffs@tglx>
Date: Tue, 21 Oct 2025 22:21:30 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Yury Norov <yury.norov@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Peter Zijlstra
<peterz@...radead.org>, Gabriele Monaco <gmonaco@...hat.com>, Mathieu
Desnoyers <mathieu.desnoyers@...icios.com>, Michael Jeanson
<mjeanson@...icios.com>, Jens Axboe <axboe@...nel.dk>, "Paul E. McKenney"
<paulmck@...nel.org>, "Gautham R. Shenoy" <gautham.shenoy@....com>,
Florian Weimer <fweimer@...hat.com>, Tim Chen <tim.c.chen@...el.com>,
TCMalloc Team <tcmalloc-eng@...gle.com>
Subject: Re: [patch 07/19] cpumask: Introduce cpumask_or_weight()
Yury!
On Wed, Oct 15 2025 at 14:06, Yury Norov wrote:
> On Wed, Oct 15, 2025 at 01:41:50PM -0400, Yury Norov wrote:
> Ok, I see now. You want to do a regular cpumask_or(), but return the
> hweight() of the result, instead of a boolean.
>
> cpumask_or_weight() could easily be confused with cpumask_weight_or().
> Can you consider a different name? (I can't seem to come up with one.)
The only thing I came up with was cpumask_or_and_weight(), but that
sounded odd too. cpumask_or_and_calc_weight() perhaps.
> Can you describe the performance impact you've mentioned in the commit
> message in more detail?
It spares the second loop over the mask and the related memory reads.
It's about 10-20% faster for a 4k CPU mask (64 iterations of one 64-bit
word each), depending on the machine I test on.
As this is invoked with the runqueue lock held, there is definitely a
desire to spare as many cycles as possible.
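
For illustration, here is a minimal userspace sketch of the difference.
It is not the kernel implementation: the names or_then_weight() and
or_weight(), the MASK_BITS/MASK_WORDS constants and the fixed 4096-bit
size are made up for the example, and the GCC/Clang builtin
__builtin_popcountll() stands in for the kernel's hweight64().

```c
#include <stddef.h>
#include <stdint.h>

#define MASK_BITS  4096                 /* hypothetical 4k CPU mask */
#define MASK_WORDS (MASK_BITS / 64)     /* 64 words of 64 bits each */

/* Two passes: OR into dst, then read dst back to count the set bits. */
static unsigned int or_then_weight(uint64_t *dst, const uint64_t *a,
				   const uint64_t *b)
{
	unsigned int weight = 0;
	size_t i;

	for (i = 0; i < MASK_WORDS; i++)
		dst[i] = a[i] | b[i];

	for (i = 0; i < MASK_WORDS; i++)
		weight += (unsigned int)__builtin_popcountll(dst[i]);

	return weight;
}

/* One pass: count each result word while it is still in a register. */
static unsigned int or_weight(uint64_t *dst, const uint64_t *a,
			      const uint64_t *b)
{
	unsigned int weight = 0;
	size_t i;

	for (i = 0; i < MASK_WORDS; i++) {
		uint64_t val = a[i] | b[i];

		dst[i] = val;
		weight += (unsigned int)__builtin_popcountll(val);
	}

	return weight;
}
```

Both variants produce the same mask and the same weight. The second one
just never reads dst back, which is where the saving comes from.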
Thanks,
tglx