Message-ID: <SJ1PR11MB608340E81A15F20EBAD75F08FC24A@SJ1PR11MB6083.namprd11.prod.outlook.com>
Date: Wed, 28 Jun 2023 16:39:46 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Koba Ko <koba.ko@...onical.com>, Borislav Petkov <bp@...en8.de>,
"James Morse" <james.morse@....com>,
Mauro Carvalho Chehab <mchehab@...nel.org>,
Robert Richter <rric@...nel.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] EDAC/i10nm: shift exponent is negative
> 	ranks = numrank(mtr);
> 	rows = numrow(mtr);
> 	cols = imc->hbm_mc ? 6 : numcol(mtr);
> +	if (ranks == -EINVAL || rows == -EINVAL || cols == -EINVAL)
> +		return 0;
This seems to be just hiding the real problem that a DIMM was found
with some number of ranks, rows, or columns that the EDAC driver
didn't expect to see. Your fix makes the driver skip over this DIMM.
Can you build your kernel with CONFIG_EDAC_DEBUG=y and see
what messages you get from this code:
static int skx_get_dimm_attr(u32 reg, int lobit, int hibit, int add,
			     int minval, int maxval, const char *name)
{
	u32 val = GET_BITFIELD(reg, lobit, hibit);

	if (val < minval || val > maxval) {
		edac_dbg(2, "bad %s = %d (raw=0x%x)\n", name, val, reg);
		return -EINVAL;
	}
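
To make the failure mode concrete, here is a rough user-space sketch of the same range check and of how an unchecked -EINVAL turns into a negative shift exponent. It is not the driver code and not a proposed fix: get_bitfield(), get_dimm_attr() and geometry_shift() are hypothetical stand-ins, the printf-style message replaces edac_dbg(), and the bit positions and limits used in the usage note are illustrative assumptions only.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's GET_BITFIELD(): bits lo..hi of v, inclusive. */
uint32_t get_bitfield(uint32_t v, int lo, int hi)
{
	return (uint32_t)(((uint64_t)v >> lo) & ((1ULL << (hi - lo + 1)) - 1));
}

/*
 * Range-checked decode modelled on the excerpt above: an out-of-range
 * field is reported and -EINVAL is returned, otherwise the field value
 * plus a fixed offset. Assumes fields narrower than 31 bits.
 */
int get_dimm_attr(uint32_t reg, int lobit, int hibit, int add,
		  int minval, int maxval, const char *name)
{
	int val = (int)get_bitfield(reg, lobit, hibit);

	if (val < minval || val > maxval) {
		fprintf(stderr, "bad %s = %d (raw=0x%x)\n", name, val, reg);
		return -EINVAL;
	}
	return val + add;
}

/*
 * Hypothetical caller-side helper: the size computation shifts by a sum
 * of these attributes, so one -EINVAL makes the exponent negative.
 * Return the exponent only when every attribute decoded cleanly.
 */
int geometry_shift(int ranks, int rows, int cols)
{
	if (ranks < 0 || rows < 0 || cols < 0)
		return -EINVAL;
	return ranks + rows + cols;
}
```

For example, with a two-bit field limited to 0..2, get_dimm_attr(0x3, 0, 1, 10, 0, 2, "cols") hits the "bad cols" path and returns -EINVAL, and geometry_shift() then refuses to produce an exponent. The debug message is still the interesting part, since it shows which raw register value the driver did not expect.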
-Tony