Message-ID: <20240930075600.GC5594@noisy.programming.kicks-ass.net>
Date: Mon, 30 Sep 2024 09:56:00 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Michael Kelley <mhklinux@...look.com>
Cc: Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>,
Yury Norov <yury.norov@...il.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Question about num_possible_cpus() and cpu_possible_mask
On Wed, Sep 25, 2024 at 04:04:33AM +0000, Michael Kelley wrote:
> Question: Is there any intention to guarantee that the cpu_possible_mask is
> "dense", in that all bit positions 0 thru (nr_cpu_ids - 1) are set, with no
> "holes"? If that were true, then num_possible_cpus() would be equal to
> nr_cpu_ids.
I think we've historically had machines with holes in the mask. And I
think we want holes for modern hybrid x86 parts that have HT,
although I'm not entirely sure where those patches are atm.
Thomas, didn't we have a patch that renumbers CPUs for hybrid crud such
that HT is always the low bit and we end up with holes because the atoms
don't have HT on?
Or was that on my plate and it got lost in the giant todo pile?
> x86 always sets up cpu_possible_mask as dense, as does ARM64 with ACPI.
> But it appears there are error cases on ARM64 with DeviceTree where this
> is not the case. I haven't looked at other architectures.
>
> There's evidence both ways:
> 1) A somewhat recent report[1] on SPARC where cpu_possible_mask
> isn't dense, and there's code assuming that it is dense. This report
> got me thinking about the question.
>
> 2) setup_nr_cpu_ids() in kernel/smp.c is coded to *not* assume it is dense
>
> 3) But there are several places throughout the kernel that do something like
> the following, which assumes they are dense:
>
> array = kcalloc(num_possible_cpus(), sizeof(<some struct>), GFP_KERNEL);
> ....
> index into "array" with smp_processor_id()
I would consider this pattern broken.
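To make the failure mode concrete, here is a minimal userspace sketch, not kernel code. The names mask_t, sim_num_possible(), sim_nr_ids(), sim_all_ids_fit() and sim_self_check() are hypothetical stand-ins invented for illustration; they only mimic what num_possible_cpus() (a count of set bits) and nr_cpu_ids (highest possible CPU id plus one) mean for a mask with holes.

```c
#include <assert.h>

/* Hypothetical stand-in for cpu_possible_mask: bit n set means CPU n is possible. */
typedef unsigned long long mask_t;

/* Analogue of num_possible_cpus(): the number of set bits in the mask. */
static unsigned int sim_num_possible(mask_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit each pass */
		n++;
	return n;
}

/* Analogue of nr_cpu_ids: highest set bit position plus one (0 if empty). */
static unsigned int sim_nr_ids(mask_t mask)
{
	unsigned int n = 0;

	for (; mask; mask >>= 1)
		n++;
	return n;
}

/* Nonzero if every possible CPU id is a valid index into an array of len entries. */
static int sim_all_ids_fit(mask_t mask, unsigned int len)
{
	return sim_nr_ids(mask) <= len;
}

/* Sparse example: CPUs 0, 2, 4 and 6 are possible (mask 0x55). */
static void sim_self_check(void)
{
	mask_t sparse = 0x55;

	assert(sim_num_possible(sparse) == 4);
	assert(sim_nr_ids(sparse) == 7);
	/* Sized by the count, CPU 6 would index past the end of the array. */
	assert(!sim_all_ids_fit(sparse, sim_num_possible(sparse)));
	/* Sized by the highest id plus one, every CPU id is in bounds. */
	assert(sim_all_ids_fit(sparse, sim_nr_ids(sparse)));
}
```

With a dense mask both sizes agree, which is why the pattern usually goes unnoticed; the two only diverge once the mask has holes.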
> On balance, I'm assuming that there's no requirement for cpu_possible_mask
> to be dense, and code like #3 above is technically wrong. It should be
> using nr_cpu_ids instead of num_possible_cpus(), which is also faster.
> We get away with it 99.99% of the time because all (or almost all?)
> architectures populate cpu_possible_mask as dense.
>
> There are 6 places in Hyper-V specific code that do #3. And it works because
> Hyper-V code only runs on x86 and ARM64 where cpu_possible_mask is
> always dense. But in the interest of correctness and robustness against
> future changes, I'm planning to fix the Hyper-V code.
>
> There are also a few other places throughout the kernel with the same
> problem, and I may look at fixing those as well.
>
> Or maybe my assumptions are off-base. Any thoughts or guidance before
> I start submitting patches?
You're on the right track; code should not assume the mask is dense.