linux-kernel - RE: [PATCH 0/5] hyper-v: Don't assume cpu_possible

Message-ID:
 <SN6PR02MB415740B41A34B1468BC6AE28D43D2@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Tue, 10 Dec 2024 19:58:34 +0000
From: Michael Kelley <mhklinux@...look.com>
To: "wei.liu@...nel.org" <wei.liu@...nel.org>
CC: "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "linux-hyperv@...r.kernel.org"
	<linux-hyperv@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-scsi@...r.kernel.org"
	<linux-scsi@...r.kernel.org>, Michael Kelley <mhklinux@...look.com>,
	"kys@...rosoft.com" <kys@...rosoft.com>, "haiyangz@...rosoft.com"
	<haiyangz@...rosoft.com>, "decui@...rosoft.com" <decui@...rosoft.com>,
	"tglx@...utronix.de" <tglx@...utronix.de>, "mingo@...hat.com"
	<mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "x86@...nel.org"
	<x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>, "joro@...tes.org"
	<joro@...tes.org>, "will@...nel.org" <will@...nel.org>,
	"robin.murphy@....com" <robin.murphy@....com>, "davem@...emloft.net"
	<davem@...emloft.net>, "edumazet@...gle.com" <edumazet@...gle.com>,
	"kuba@...nel.org" <kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>,
	"James.Bottomley@...senPartnership.com"
	<James.Bottomley@...senPartnership.com>, "martin.petersen@...cle.com"
	<martin.petersen@...cle.com>
Subject: RE: [PATCH 0/5] hyper-v: Don't assume cpu_possible_mask is dense

From: mhkelley58@...il.com <mhkelley58@...il.com> Sent: Wednesday, October 2, 2024 8:53 PM
> 
> Code specific to Hyper-V guests currently assumes the cpu_possible_mask
> is "dense" -- i.e., all bit positions 0 through (nr_cpu_ids - 1) are set,
> with no "holes". Therefore, num_possible_cpus() is assumed to be equal
> to nr_cpu_ids.
> 
> Per a separate discussion[1], this assumption is not valid in the
> general case. For example, the function setup_nr_cpu_ids() in
> kernel/smp.c is coded to assume cpu_possible_mask may be sparse,
> and other patches have been made in the past to correctly handle
> the sparseness. See bc75e99983df1efd ("rcu: Correctly handle sparse
> possible cpu") as noted by Mark Rutland.
> 
> The general case notwithstanding, the configurations that Hyper-V
> provides to guest VMs on x86 and ARM64 hardware, in combination
> with the algorithms currently used by architecture specific code
> to assign Linux CPU numbers, *do* always produce a dense
> cpu_possible_mask. So the invalid assumption is not currently
> causing failures. But in the interest of correctness, and robustness
> against future changes in the code that populates cpu_possible_mask,
> update the Hyper-V code to no longer assume denseness.
> 
> The typical code pattern with the invalid assumption is as follows:
> 
> 	array = kcalloc(num_possible_cpus(), sizeof(<some struct>),
> 			GFP_KERNEL);
> 	....
> 	index into "array" with smp_processor_id()
> 
> In such a case, the array might be indexed by a value beyond the size
> of the array. The correct approach is to allocate the array with size
> "nr_cpu_ids". While this will probably leave unused any array entries
> corresponding to holes in cpu_possible_mask, the holes are assumed to
> be minimal and hence the amount of memory wasted by unused entries is
> minimal.
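
To make the sizing problem concrete, here is a minimal userspace sketch, not code from the patches. It models a sparse mask with a plain bool array; every name in it (possible[], MODEL_MAX_CPUS, and the model_* helpers) is a hypothetical stand-in, and the hole at CPUs 2 and 3 is an assumed example. It only illustrates why an array sized by the count of possible CPUs can be too short when indexed by CPU number, while one sized by the ID space cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Userspace model only; none of these names are kernel APIs.
 * possible[] plays the role of cpu_possible_mask, with an assumed
 * hole at CPU numbers 2 and 3.
 */
#define MODEL_MAX_CPUS 8
static const bool possible[MODEL_MAX_CPUS] = {
	true, true, false, false, true, true, false, false
};

/* Models num_possible_cpus(): the number of set bits in the mask. */
static unsigned int model_num_possible_cpus(void)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < MODEL_MAX_CPUS; cpu++)
		if (possible[cpu])
			n++;
	return n;
}

/* Models nr_cpu_ids: the highest set bit position plus one. */
static unsigned int model_nr_cpu_ids(void)
{
	unsigned int cpu, ids = 0;

	for (cpu = 0; cpu < MODEL_MAX_CPUS; cpu++)
		if (possible[cpu])
			ids = cpu + 1;
	return ids;
}

/* True if every possible CPU number is a valid index for 'len' entries. */
static bool model_all_cpus_fit(unsigned int len)
{
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_MAX_CPUS; cpu++)
		if (possible[cpu] && cpu >= len)
			return false;
	return true;
}

/* Allocates the array by ID space, so every possible CPU is in bounds. */
static int *model_alloc_percpu_array(void)
{
	return calloc(model_nr_cpu_ids(), sizeof(int));
}

/* Stores a value at index 'cpu'; the assert is the bounds check. */
static void model_store(int *array, unsigned int len, unsigned int cpu,
			int value)
{
	assert(cpu < len);
	array[cpu] = value;
}
```

With this mask the count-based size is smaller than the ID-space size, so model_all_cpus_fit() fails for the former and passes for the latter. The entries for the hole go unused, which is the small memory cost described above.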
> 
> Removing the assumption in Hyper-V code is done in several patches
> because they touch different kernel subsystems:
> 
> Patch 1: Hyper-V x86 initialization of hv_vp_assist_page (there's no
> 	 hv_vp_assist_page on ARM64)
> Patch 2: Hyper-V common init of hv_vp_index
> Patch 3: Hyper-V IOMMU driver
> Patch 4: storvsc driver
> Patch 5: netvsc driver

Wei --

Could you pick up Patches 1, 2, and 3 in this series for the hyperv-next
tree? Peter Zijlstra acked the full series [2], and Patches 4 and 5 have
already been picked by the SCSI and net maintainers respectively [3][4].

Let me know if you have any concerns.

Thanks,

Michael

[2] https://lore.kernel.org/linux-hyperv/20241004100742.GO18071@noisy.programming.kicks-ass.net/
[3] https://lore.kernel.org/linux-hyperv/yq15xnsjlc1.fsf@ca-mkp.ca.oracle.com/
[4] https://lore.kernel.org/linux-hyperv/172808404024.2772330.2975585273609596688.git-patchwork-notify@kernel.org/

> 
> I tested the changes by hacking the construction of cpu_possible_mask
> to include a hole on x86. With a configuration set to demonstrate the
> problem, a Hyper-V guest kernel eventually crashes due to memory
> corruption. After the patches in this series, the crash does not occur.
> 
> [1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/
> 
> Michael Kelley (5):
>   x86/hyperv: Don't assume cpu_possible_mask is dense
>   Drivers: hv: Don't assume cpu_possible_mask is dense
>   iommu/hyper-v: Don't assume cpu_possible_mask is dense
>   scsi: storvsc: Don't assume cpu_possible_mask is dense
>   hv_netvsc: Don't assume cpu_possible_mask is dense
> 
>  arch/x86/hyperv/hv_init.c       |  2 +-
>  drivers/hv/hv_common.c          |  4 ++--
>  drivers/iommu/hyperv-iommu.c    |  4 ++--
>  drivers/net/hyperv/netvsc_drv.c |  2 +-
>  drivers/scsi/storvsc_drv.c      | 13 ++++++-------
>  5 files changed, 12 insertions(+), 13 deletions(-)
> 
> --
> 2.25.1
>