Message-ID: <SJ1PR11MB60838660FB93BD26F4C824A5FC25A@SJ1PR11MB6083.namprd11.prod.outlook.com>
Date: Thu, 29 Jun 2023 16:11:58 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>,
Koba Ko <koba.ko@...onical.com>
CC: Borislav Petkov <bp@...en8.de>, James Morse <james.morse@....com>,
"Mauro Carvalho Chehab" <mchehab@...nel.org>,
Robert Richter <rric@...nel.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] EDAC/i10nm: shift exponent is negative
> I don't agree with simply skipping over a DIMM, even if EDAC doesn't expect to see it,
> because the EDAC driver can still report errors that occur on this DIMM.
>
> As per Tony's suggestion, could you test your kernel with CONFIG_EDAC_DEBUG=y and see the result?
>
> @Luck, Tony, perhaps we could turn the debug print
>
> edac_dbg(2, "bad %s = %d (raw=0x%x)\n", name, val, reg);
>
> to an error-print explicitly
>
> skx_printk(KERN_ERR, "bad %s = %d (raw=0x%x)\n", name, val, reg);
>
> Let the user have the chance to notice there is a DIMM that EDAC doesn't expect to see.
We need both. Changing that debug message to a real error message will let the user
know that EDAC doesn't recognize this DIMM (and will provide the information for you
or me to fix the driver).
But we also need Ko's fix - because it makes no sense to just use that negative shift
and pretend that EDAC knows how to handle this DIMM.
-Tony
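
As a rough illustration of the "we need both" point above, here is a
user-space sketch in C. It is not the driver code: get_dimm_attr() and
dimm_size_mb() are hypothetical names, the bit positions and ranges in the
usage note are made up, and fprintf(stderr, ...) stands in for the
skx_printk(KERN_ERR, ...) call quoted earlier. The first function always
reports an out-of-range field; the second refuses to shift by a negative
exponent when an attribute failed to decode.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: extract bits [lobit, hibit] of reg, range-check the
 * field, and return it plus an offset. Returns -EINVAL for a bad field.
 */
static int get_dimm_attr(uint32_t reg, int lobit, int hibit, int add,
                         int minval, int maxval, const char *name)
{
    uint32_t width = (uint32_t)(hibit - lobit + 1);
    uint32_t mask = width >= 32 ? 0xffffffffu : (1u << width) - 1u;
    int val = (int)((reg >> lobit) & mask);

    if (val < minval || val > maxval) {
        /* Always-visible error, not a debug-only message. */
        fprintf(stderr, "bad %s = %d (raw=0x%x)\n", name, val, (unsigned)reg);
        return -EINVAL;
    }
    return val + add;
}

/*
 * Hypothetical size computation in MiB. Returns 0 when any attribute is an
 * error code, so a negative value is never used as a shift exponent.
 */
static uint64_t dimm_size_mb(int rows, int cols, int ranks, int banks)
{
    if (rows < 0 || cols < 0 || ranks < 0 || banks < 0)
        return 0;
    return ((1ull << (rows + cols + ranks)) * (uint64_t)banks) >> (20 - 3);
}
```

With these assumed parameters, get_dimm_attr(0x5, 0, 2, 10, 0, 3, "cols")
prints the error and returns -EINVAL, and passing that result on to
dimm_size_mb() yields 0 instead of an undefined shift.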