Message-ID: <BYAPR12MB2630ED1425A3F01727E1C45BF8620@BYAPR12MB2630.namprd12.prod.outlook.com>
Date:   Fri, 1 Nov 2019 15:19:36 +0000
From:   "Ghannam, Yazen" <Yazen.Ghannam@....com>
To:     Borislav Petkov <bp@...en8.de>
CC:     "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2 0/6] AMD64 EDAC: Check for nodes without memory, etc.

> -----Original Message-----
> From: Borislav Petkov <bp@...en8.de>
> Sent: Friday, October 25, 2019 9:35 AM
> To: Ghannam, Yazen <Yazen.Ghannam@....com>
> Cc: linux-edac@...r.kernel.org; linux-kernel@...r.kernel.org
> Subject: Re: [PATCH v2 0/6] AMD64 EDAC: Check for nodes without memory, etc.
> 
> On Tue, Oct 22, 2019 at 08:35:08PM +0000, Ghannam, Yazen wrote:
> > From: Yazen Ghannam <yazen.ghannam@....com>
> >
> > Hi Boris,
> >
> > Most of these patches address the issue where the module checks and
> > complains about DRAM ECC on nodes without memory.
> >
> > Thanks,
> > Yazen
> >
> > Link:
> > https://lkml.kernel.org/r/20191018153114.39378-1-Yazen.Ghannam@amd.com
> >
> > Yazen Ghannam (6):
> >   EDAC/amd64: Make struct amd64_family_type global
> >   EDAC/amd64: Gather hardware information early
> >   EDAC/amd64: Save max number of controllers to family type
> >   EDAC/amd64: Use cached data when checking for ECC
> >   EDAC/amd64: Check for memory before fully initializing an instance
> >   EDAC/amd64: Set grain per DIMM
> >
> >  drivers/edac/amd64_edac.c | 196 +++++++++++++++++++-------------------
> >  drivers/edac/amd64_edac.h |   2 +
> >  2 files changed, 100 insertions(+), 98 deletions(-)
> 
> Almost there: now it dumps the whole shebang twice. This is on an old
> F10h box which doesn't have ECC DIMMs:
> 
> [    2.222853] EDAC MC: Ver: 3.0.0
> [    2.226881] EDAC DEBUG: edac_mc_sysfs_init: device mc created
> [    5.726912] EDAC amd64: F10h detected (node 0).
...
> [    6.208087] EDAC amd64: F10h detected (node 0).

Is the module being probed twice? We have this problem in general, e.g. the
module gets loaded multiple times on failure.

The clue for me is that node 0 gets detected twice. That detection happens in
per_family_init(), early in probe_one_instance().

In any case, I think we can make !ecc_enabled(pvt) in probe_one_instance() a
failure now that we have an explicit check for memory on a node. In other
words, if we have memory and ECC is disabled then this is a failure for the
module.
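
To make the proposed ordering concrete, here is a rough userspace sketch of the
decision, not the actual driver code. The struct, the enum, and the function
name are hypothetical stand-ins; only the ordering (memory check first, then
ECC) comes from the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-node state; not the real struct amd64_pvt. */
struct node_state {
	bool has_memory;   /* outcome of the explicit per-node memory check */
	bool ecc_enabled;  /* outcome of the DRAM ECC check */
};

/* Hypothetical outcomes; the real driver returns error codes instead. */
enum probe_result {
	PROBE_SKIPPED,  /* node has no memory, nothing to set up */
	PROBE_FAILED,   /* memory present but ECC disabled */
	PROBE_OK        /* memory present and ECC enabled */
};

/* Models only the ordering of the two checks discussed above. */
enum probe_result probe_one_instance_sketch(const struct node_state *node)
{
	/* A node without memory is skipped quietly, not treated as an error. */
	if (!node->has_memory)
		return PROBE_SKIPPED;

	/* With memory known to be present, disabled ECC becomes a failure. */
	if (!node->ecc_enabled)
		return PROBE_FAILED;

	return PROBE_OK;
}
```

With this ordering, a memoryless node never reaches the ECC check, so it can no
longer trigger the ECC complaint.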

Thanks,
Yazen
