lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20250806065707.3533345-1-qiuxu.zhuo@intel.com>
Date: Wed,  6 Aug 2025 14:57:07 +0800
From: Qiuxu Zhuo <qiuxu.zhuo@...el.com>
To: Tony Luck <tony.luck@...el.com>,
	Borislav Petkov <bp@...en8.de>
Cc: Qiuxu Zhuo <qiuxu.zhuo@...el.com>,
	Jose Jesus Ambriz Meza <jose.jesus.ambriz.meza@...el.com>,
	Chia-Lin Kao <acelan.kao@...onical.com>,
	James Morse <james.morse@....com>,
	Mauro Carvalho Chehab <mchehab@...nel.org>,
	Robert Richter <rric@...nel.org>,
	Lai Yi <yi1.lai@...ux.intel.com>,
	linux-edac@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: [PATCH 1/1] EDAC/i10nm: Skip DIMM enumeration on a disabled memory controller

When loading the i10nm_edac driver on some Intel Granite Rapids servers,
a call trace may appear as follows:

  UBSAN: shift-out-of-bounds in drivers/edac/skx_common.c:453:16
  shift exponent -66 is negative
  ...
  __ubsan_handle_shift_out_of_bounds+0x1e3/0x390
  skx_get_dimm_info.cold+0x47/0xd40 [skx_edac_common]
  i10nm_get_dimm_config+0x23e/0x390 [i10nm_edac]
  skx_register_mci+0x159/0x220 [skx_edac_common]
  i10nm_init+0xcb0/0x1ff0 [i10nm_edac]
  ...

This occurs because some BIOS may disable a memory controller if there
aren't any memory DIMMs populated on this memory controller. The DIMMMTR
register of this disabled memory controller contains the invalid value
~0, resulting in the call trace above.

Fix this call trace by skipping DIMM enumeration on a disabled memory
controller.

Fixes: ba987eaaabf9 ("EDAC/i10nm: Add Intel Granite Rapids server support")
Reported-by: Jose Jesus Ambriz Meza <jose.jesus.ambriz.meza@...el.com>
Reported-by: Chia-Lin Kao (AceLan) <acelan.kao@...onical.com>
Closes: https://lore.kernel.org/all/20250730063155.2612379-1-acelan.kao@canonical.com/
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@...onical.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@...el.com>
---
 drivers/edac/i10nm_base.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
index a3fca2567752..ac9afcf461d7 100644
--- a/drivers/edac/i10nm_base.c
+++ b/drivers/edac/i10nm_base.c
@@ -1047,6 +1047,15 @@ static bool i10nm_check_ecc(struct skx_imc *imc, int chan)
 	return !!GET_BITFIELD(mcmtr, 2, 2);
 }
 
+static bool i10nm_channel_disabled(struct skx_imc *imc, int chan)
+{
+	u32 mcmtr = I10NM_GET_MCMTR(imc, chan);
+
+	edac_dbg(1, "mc%d ch%d mcmtr reg %x\n", imc->mc, chan, mcmtr);
+
+	return (mcmtr == ~0 || GET_BITFIELD(mcmtr, 18, 18));
+}
+
 static int i10nm_get_dimm_config(struct mem_ctl_info *mci,
 				 struct res_config *cfg)
 {
@@ -1060,6 +1069,11 @@ static int i10nm_get_dimm_config(struct mem_ctl_info *mci,
 		if (!imc->mbase)
 			continue;
 
+		if (i10nm_channel_disabled(imc, i)) {
+			edac_dbg(1, "mc%d ch%d is disabled.\n", imc->mc, i);
+			continue;
+		}
+
 		ndimms = 0;
 
 		if (res_cfg->type != GNR)

base-commit: 038d61fd642278bab63ee8ef722c50d10ab01e8f
-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ