Message-ID: <20180927214400.GA2249@agluck-desk>
Date: Thu, 27 Sep 2018 14:44:01 -0700
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Russ Anderson <rja@....com>,
Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
Greg KH <gregkh@...uxfoundation.org>,
Justin Ernst <justin.ernst@....com>, russ.anderson@....com,
Mauro Carvalho Chehab <mchehab@...nel.org>,
linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
Aristeu Rozanski Filho <arozansk@...hat.com>
Subject: Re: [PATCH] Raise maximum number of memory controllers
On Thu, Sep 27, 2018 at 06:52:44AM +0200, Borislav Petkov wrote:
> On Wed, Sep 26, 2018 at 04:02:57PM -0700, Luck, Tony wrote:
> > But ... we are at -rc5. Not sure that we'll figure out, write, test & debug
> > the proper solution in the next 3-4 weeks. So perhaps we should apply
> >
> > -#define EDAC_MAX_MCS 16
> > +#define EDAC_MAX_MCS 64
> >
> > as a temporary band-aid to get HPE's 32-socket machine running while
> > we work on the proper fix?
>
> Yeah, after sleeping on it I see it the same way - band-aid it now and
> clean it up properly later.
The problem with your patch that gets rid of EDAC_MAX_MCS is in making
the device links under /sys/bus/edac, which is hinted at in some of the
code your patch deleted:
- /*
- * The memory controller needs its own bus, in order to avoid
- * namespace conflicts at /sys/bus/edac.
- */
- name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
- if (!name)
- return -ENOMEM;
-
- mci->bus->name = name;
-
- edac_dbg(0, "creating bus %s\n", mci->bus->name);
-
- err = bus_register(mci->bus);
Just to see if there was anything else wrong I added a patch to
make the names unique:
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 2ca2012f2857..6ec6d8a2adb8 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -410,7 +410,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
device_initialize(&csrow->dev);
csrow->dev.parent = &mci->dev;
csrow->mci = mci;
- dev_set_name(&csrow->dev, "csrow%d", index);
+ dev_set_name(&csrow->dev, "mci%d_csrow%d", mci->mc_idx, index);
dev_set_drvdata(&csrow->dev, csrow);
edac_dbg(0, "creating (virtual) csrow node %s\n",
@@ -641,9 +641,9 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
dimm->dev.parent = &mci->dev;
if (mci->csbased)
- dev_set_name(&dimm->dev, "rank%d", index);
+ dev_set_name(&dimm->dev, "mci%d_rank%d", mci->mc_idx, index);
else
- dev_set_name(&dimm->dev, "dimm%d", index);
+ dev_set_name(&dimm->dev, "mci%d_dimm%d", mci->mc_idx, index);
dev_set_drvdata(&dimm->dev, dimm);
pm_runtime_forbid(&mci->dev);
which seemed to work. But then I began wondering: what are the ABI
expectations of applications that read the EDAC /sys files?
Is this the current source repository? https://github.com/grondo/edac-utils
This code doesn't seem to know about the "dimm*" directories below the
"mc*" level. It just looks for the csrow* entries.
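To make the ABI concern concrete, here is a minimal user-space sketch (not kernel code, and not taken from edac-utils). The two name builders mirror the "csrow%d" and "mci%d_csrow%d" format strings from the diff above. legacy_csrow_index() is a hypothetical matcher, written on the assumption that a reader recognizes csrow entries by a leading "csrow" followed by a number; the real tool may well match differently.

```c
#include <stdio.h>
#include <string.h>

/* Name as created before the patch: the controller index is not part of it. */
static void old_csrow_name(char *buf, size_t len, int mc_idx, int index)
{
	(void)mc_idx;	/* unused, so two controllers yield the same name */
	snprintf(buf, len, "csrow%d", index);
}

/* Name as created with the patch: the controller index makes it unique. */
static void new_csrow_name(char *buf, size_t len, int mc_idx, int index)
{
	snprintf(buf, len, "mci%d_csrow%d", mc_idx, index);
}

/*
 * Hypothetical reader (an assumption, not edac-utils code): accept only
 * names of the form "csrow<N>". Returns N on a match, -1 otherwise.
 */
static int legacy_csrow_index(const char *name)
{
	int index;

	if (sscanf(name, "csrow%d", &index) != 1)
		return -1;
	return index;
}
```

Under those assumptions, the old names for csrow 3 on mc0 and mc1 compare equal, which is the namespace conflict on a shared bus. The new names differ, but a matcher like the one above would no longer recognize them.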
-Tony