Message-ID: <CY8PR11MB713480F6D06D11BC4742AA0E8925A@CY8PR11MB7134.namprd11.prod.outlook.com>
Date:   Thu, 29 Jun 2023 09:58:54 +0000
From:   "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>
To:     Koba Ko <koba.ko@...onical.com>, "Luck, Tony" <tony.luck@...el.com>
CC:     Borislav Petkov <bp@...en8.de>, James Morse <james.morse@....com>,
        "Mauro Carvalho Chehab" <mchehab@...nel.org>,
        Robert Richter <rric@...nel.org>,
        "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] EDAC/i10nm: shift exponent is negative

Hi Ko,

I don't agree with simply skipping over a DIMM, even one that EDAC doesn't expect to see,
because the EDAC driver can still report errors for that DIMM when errors occur on it.

As per Tony's suggestion, could you test your kernel with CONFIG_EDAC_DEBUG=y and see the result?
 
@Luck, Tony, perhaps we could turn the debug print
     
       edac_dbg(2, "bad %s = %d (raw=0x%x)\n", name, val, reg);

into an explicit error print

     skx_printk(KERN_ERR, "bad %s = %d (raw=0x%x)\n", name, val, reg);

That gives the user a chance to notice that there is a DIMM EDAC doesn't expect to see.
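
For illustration only, here is a minimal user-space sketch of the idea, not the driver code itself. The names get_bitfield() and get_dimm_attr() are hypothetical stand-ins for GET_BITFIELD() and skx_get_dimm_attr(), fprintf(stderr, ...) stands in for skx_printk(KERN_ERR, ...), and the final "val + add" return is an assumption about the part of the function that is not quoted in this thread. It shows that an out-of-range field is reported unconditionally and that the caller gets back a negative value, which must not be used as a shift exponent.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for GET_BITFIELD(): bits lo..hi, inclusive. */
static uint32_t get_bitfield(uint32_t reg, int lo, int hi)
{
	uint64_t mask = ((uint64_t)1 << (hi - lo + 1)) - 1;

	return (uint32_t)((reg >> lo) & mask);
}

/*
 * Hypothetical stand-in for skx_get_dimm_attr(). The out-of-range case
 * is printed unconditionally (stderr here, KERN_ERR in the proposal)
 * instead of only when CONFIG_EDAC_DEBUG=y.
 */
static int get_dimm_attr(uint32_t reg, int lobit, int hibit, int add,
			 int minval, int maxval, const char *name)
{
	uint32_t val = get_bitfield(reg, lobit, hibit);

	if (val < (uint32_t)minval || val > (uint32_t)maxval) {
		fprintf(stderr, "bad %s = %u (raw=0x%x)\n",
			name, (unsigned)val, (unsigned)reg);
		return -EINVAL;	/* negative: unusable as a shift exponent */
	}

	/* Assumption: the unquoted tail of the function returns val + add. */
	return (int)val + add;
}
```

With these stand-ins, a call such as get_dimm_attr(0x3, 0, 1, 10, 0, 2, "cols") prints the "bad cols" message and returns -EINVAL, while a register value of 0x1 with the same arguments returns 11.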

- Qiuxu

> From: Koba Ko <koba.ko@...onical.com>
> Sent: Thursday, June 29, 2023 11:53 AM
> To: Luck, Tony <tony.luck@...el.com>
> Cc: Borislav Petkov <bp@...en8.de>; James Morse <james.morse@....com>;
> Mauro Carvalho Chehab <mchehab@...nel.org>; Robert Richter
> <rric@...nel.org>; linux-edac@...r.kernel.org; linux-kernel@...r.kernel.org
> Subject: Re: [PATCH] EDAC/i10nm: shift exponent is negative
> 
> hi Luck,
> I agree with your points
> is it expected to shift with negative?
> 
> Thanks
> Koba Ko
> 
> On Thu, Jun 29, 2023 at 12:41 AM Luck, Tony <tony.luck@...el.com> wrote:
> >
> > >       ranks = numrank(mtr);
> > >       rows = numrow(mtr);
> > >       cols = imc->hbm_mc ? 6 : numcol(mtr);
> > > +     if (ranks == -EINVAL || rows == -EINVAL || cols == -EINVAL)
> > > +             return 0;
> >
> > This seems to be just hiding the real problem that a DIMM was found
> > with some number of ranks, rows, or columns that the EDAC driver
> > didn't expect to see. Your fix makes the driver skip over this DIMM.
> >
> > Can you build your kernel with CONFIG_EDAC_DEBUG=y and see what
> > messages you get from this code:
> >
> > static int skx_get_dimm_attr(u32 reg, int lobit, int hibit, int add,
> >                              int minval, int maxval, const char *name)
> > {
> >         u32 val = GET_BITFIELD(reg, lobit, hibit);
> >
> >         if (val < minval || val > maxval) {
> >                 edac_dbg(2, "bad %s = %d (raw=0x%x)\n", name, val, reg);
> >                 return -EINVAL;
> >         }
> >
> > -Tony
> >
> >
