Date:   Thu, 27 Sep 2018 14:44:01 -0700
From:   "Luck, Tony" <tony.luck@...el.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     Russ Anderson <rja@....com>,
        Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
        Greg KH <gregkh@...uxfoundation.org>,
        Justin Ernst <justin.ernst@....com>, russ.anderson@....com,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        Aristeu Rozanski Filho <arozansk@...hat.com>
Subject: Re: [PATCH] Raise maximum number of memory controllers

On Thu, Sep 27, 2018 at 06:52:44AM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 04:02:57PM -0700, Luck, Tony wrote:
> > But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
> > the proper solution in the next 3-4 weeks. So perhaps we should apply
> > 
> > -#define EDAC_MAX_MCS   16
> > +#define EDAC_MAX_MCS   64
> > 
> > as a temporary band-aid to get HPE's 32-socket machine running while
> > we work on the proper fix?
> 
> Yeah, after sleeping on it I see it the same way - band-aid it now and
> clean it up properly later.

The problem with your patch that gets rid of EDAC_MAX_MCS is that it
creates conflicting device links under /sys/bus/edac, which is hinted at
in some of the code your patch deleted:

-       /*
-        * The memory controller needs its own bus, in order to avoid
-        * namespace conflicts at /sys/bus/edac.
-        */
-       name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
-       if (!name)
-               return -ENOMEM;
-
-       mci->bus->name = name;
-
-       edac_dbg(0, "creating bus %s\n", mci->bus->name);
-
-       err = bus_register(mci->bus);

Just to see if there was anything else wrong I added a patch to
make the names unique:


diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 2ca2012f2857..6ec6d8a2adb8 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -410,7 +410,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 	device_initialize(&csrow->dev);
 	csrow->dev.parent = &mci->dev;
 	csrow->mci = mci;
-	dev_set_name(&csrow->dev, "csrow%d", index);
+	dev_set_name(&csrow->dev, "mci%d_csrow%d", mci->mc_idx, index);
 	dev_set_drvdata(&csrow->dev, csrow);
 
 	edac_dbg(0, "creating (virtual) csrow node %s\n",
@@ -641,9 +641,9 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	dimm->dev.parent = &mci->dev;
 	if (mci->csbased)
-		dev_set_name(&dimm->dev, "rank%d", index);
+		dev_set_name(&dimm->dev, "mci%d_rank%d", mci->mc_idx, index);
 	else
-		dev_set_name(&dimm->dev, "dimm%d", index);
+		dev_set_name(&dimm->dev, "mci%d_dimm%d", mci->mc_idx, index);
 	dev_set_drvdata(&dimm->dev, dimm);
 	pm_runtime_forbid(&mci->dev);
 

which seemed to work.  But then I began wondering: what ABI do applications
that read the EDAC /sys files expect?
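
To make the naming change concrete, here is a minimal userspace sketch (not
kernel code) of the two naming schemes from the diff above. The helper names
format_csrow_name() and csrow_names_collide() are hypothetical and exist only
for illustration. The sketch assumes that, without a per-controller bus, every
csrow device name lands in one shared namespace:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Builds a csrow device name the old way ("csrow%d") or the way the
 * diff above does ("mci%d_csrow%d").  Returns snprintf()'s result.
 */
static int format_csrow_name(char *buf, size_t size, int mc_idx, int index,
			     int prefixed)
{
	assert(buf != NULL && size > 0);

	if (prefixed)
		return snprintf(buf, size, "mci%d_csrow%d", mc_idx, index);
	return snprintf(buf, size, "csrow%d", index);
}

/*
 * Returns 1 if the same csrow index on two different controllers would
 * produce the same device name in a shared namespace, 0 otherwise.
 */
static int csrow_names_collide(int mc_a, int mc_b, int index, int prefixed)
{
	char a[40], b[40];

	format_csrow_name(a, sizeof(a), mc_a, index, prefixed);
	format_csrow_name(b, sizeof(b), mc_b, index, prefixed);
	return strcmp(a, b) == 0;
}
```

With the old format, csrow 0 on mc0 and on mc1 both come out as "csrow0";
with the controller index in the name they become "mci0_csrow0" and
"mci1_csrow0", which is all the renaming is meant to achieve.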

Is this the current source repository?  https://github.com/grondo/edac-utils

This code doesn't seem to know about the "dimm*" directories below the
"mc*" level. It just looks for the csrow* entries.
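
As a rough illustration of the ABI worry, the sketch below shows the kind of
name check a userspace tool might apply to entries under an "mc*" directory.
This is an assumption about how such a scanner could work, not code taken
from edac-utils; is_legacy_csrow_name() is a hypothetical helper:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not from edac-utils): returns 1 if `name` looks
 * like a legacy csrow directory, i.e. "csrow" followed by one or more
 * decimal digits, and 0 otherwise.
 */
static int is_legacy_csrow_name(const char *name)
{
	static const char prefix[] = "csrow";
	const size_t plen = sizeof(prefix) - 1;
	const char *p;

	assert(name != NULL);

	if (strncmp(name, prefix, plen) != 0)
		return 0;

	p = name + plen;
	if (*p == '\0')
		return 0;	/* bare "csrow" with no index */

	for (; *p != '\0'; p++)
		if (!isdigit((unsigned char)*p))
			return 0;
	return 1;
}
```

A tool matching names this way would accept "csrow0" but skip "mci0_csrow0",
so if the directory names visible to applications changed along with the
device names, such a tool would silently stop finding any csrows.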

-Tony
