linux-kernel - Re: Question about num_possible_cpus() and cpu_possible

Message-ID: <ZvpsikythXnvZ7V_@J2N7QTR9R3>
Date: Mon, 30 Sep 2024 10:16:58 +0100
From: Mark Rutland <mark.rutland@....com>
To: Michael Kelley <mhklinux@...look.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
	"peterz@...radead.org" <peterz@...radead.org>,
	Borislav Petkov <bp@...en8.de>, Yury Norov <yury.norov@...il.com>,
	"x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Question about num_possible_cpus() and cpu_possible_mask

On Wed, Sep 25, 2024 at 04:04:33AM +0000, Michael Kelley wrote:
> Question:  Is there any intention to guarantee that the cpu_possible_mask is
> "dense", in that all bit positions 0 thru (nr_cpu_ids - 1) are set, with no
> "holes"? If that were true, then num_possible_cpus() would be equal to
> nr_cpu_ids.
> 
> x86 always sets up cpu_possible_mask as dense, as does ARM64 with ACPI.
> But it appears there are error cases on ARM64 with DeviceTree where this
> is not the case. I haven't looked at other architectures.
> 
> There's evidence both ways:
> 1) A somewhat recent report[1] on SPARC where cpu_possible_mask
>    isn't dense, and there's code assuming that it is dense. This report
>    got me thinking about the question.
>   
> 2) setup_nr_cpu_ids() in kernel/smp.c is coded to *not* assume it is dense
> 
> 3) But there are several places throughout the kernel that do something like
>    the following, which assumes they are dense:
> 
> 	array = kcalloc(num_possible_cpus(), sizeof(<some struct>), GFP_KERNEL);
> 	....
> 	index into "array" with smp_processor_id()
> 
> On balance, I'm assuming that there's no requirement for cpu_possible_mask
> to be dense, and code like #3 above is technically wrong. It should be
> using nr_cpu_ids instead of num_possible_cpus(), which is also faster.
> We get away with it 99.99% of the time because all (or almost all?)
> architectures populate cpu_possible_mask as dense.
> 
> There are 6 places in Hyper-V specific code that do #3. And it works because
> Hyper-V code only runs on x86 and ARM64 where cpu_possible_mask is
> always dense.

Maybe that happens to be true under Hyper-V, but in general
cpu_possible_mask is not always dense on arm64, and we've had to change
core code to handle that in the past, e.g.

  bc75e99983df1efd ("rcu: Correctly handle sparse possible cpu")

> But in the interest of correctness and robustness against
> future changes, I'm planning to fix the Hyper-V code.

To me, that sounds like the right thing to do.

Mark.
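
Illustrative note, not part of the original message: the sizing problem
in pattern #3 can be sketched in user space. This is only a model under
stated assumptions. The model_* names and the example mask 0x33 are
hypothetical and are not kernel APIs. The mask has CPUs 0, 1, 4 and 5
possible, with holes at 2 and 3, so the population count (what
num_possible_cpus() reports) is 4 while the highest CPU number plus one
(what nr_cpu_ids represents) is 6.

```c
#include <stddef.h>

/*
 * Hypothetical user-space model of a sparse possible mask.
 * None of these names exist in the kernel; they only mimic the
 * relationship between num_possible_cpus() and nr_cpu_ids.
 */
#define MODEL_MAX_CPUS 32u

/* Models num_possible_cpus(): number of bits set in the mask. */
static unsigned int model_num_possible_cpus(unsigned int mask)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < MODEL_MAX_CPUS; cpu++)
		if (mask & (1u << cpu))
			n++;
	return n;
}

/* Models nr_cpu_ids: highest possible CPU number plus one. */
static unsigned int model_nr_cpu_ids(unsigned int mask)
{
	unsigned int cpu, ids = 0;

	for (cpu = 0; cpu < MODEL_MAX_CPUS; cpu++)
		if (mask & (1u << cpu))
			ids = cpu + 1;
	return ids;
}

/*
 * Counts possible CPUs whose number would index past the end of an
 * array sized by the population count, as in pattern #3.
 */
static unsigned int model_overflowing_cpus(unsigned int mask)
{
	unsigned int cpu, bad = 0;
	unsigned int count = model_num_possible_cpus(mask);

	for (cpu = 0; cpu < MODEL_MAX_CPUS; cpu++)
		if ((mask & (1u << cpu)) && cpu >= count)
			bad++;
	return bad;
}
```

In this model, an array sized by model_num_possible_cpus(0x33) has 4
entries, so CPUs 4 and 5 would index out of bounds, whereas sizing by
model_nr_cpu_ids(0x33) gives 6 entries and every possible CPU number is
a valid index. With a dense mask such as 0x0f the two values agree and
nothing overflows, which is why the pattern usually goes unnoticed.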