Message-ID: <20250115130349.655c5461@foz.lan>
Date: Wed, 15 Jan 2025 13:03:49 +0100
From: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
To: Shiju Jose <shiju.jose@...wei.com>
Cc: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>,
"tony.luck@...el.com" <tony.luck@...el.com>, "rafael@...nel.org"
<rafael@...nel.org>, "lenb@...nel.org" <lenb@...nel.org>,
"mchehab@...nel.org" <mchehab@...nel.org>, "dan.j.williams@...el.com"
<dan.j.williams@...el.com>, "dave@...olabs.net" <dave@...olabs.net>,
"Jonathan Cameron" <jonathan.cameron@...wei.com>, "dave.jiang@...el.com"
<dave.jiang@...el.com>, "alison.schofield@...el.com"
<alison.schofield@...el.com>, "vishal.l.verma@...el.com"
<vishal.l.verma@...el.com>, "ira.weiny@...el.com" <ira.weiny@...el.com>,
"david@...hat.com" <david@...hat.com>, "Vilas.Sridharan@....com"
<Vilas.Sridharan@....com>, "leo.duran@....com" <leo.duran@....com>,
"Yazen.Ghannam@....com" <Yazen.Ghannam@....com>, "rientjes@...gle.com"
<rientjes@...gle.com>, "jiaqiyan@...gle.com" <jiaqiyan@...gle.com>,
"Jon.Grimm@....com" <Jon.Grimm@....com>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "naoya.horiguchi@....com"
<naoya.horiguchi@....com>, "james.morse@....com" <james.morse@....com>,
"jthoughton@...gle.com" <jthoughton@...gle.com>, "somasundaram.a@....com"
<somasundaram.a@....com>, "erdemaktas@...gle.com" <erdemaktas@...gle.com>,
"pgonda@...gle.com" <pgonda@...gle.com>, "duenwen@...gle.com"
<duenwen@...gle.com>, "gthelen@...gle.com" <gthelen@...gle.com>,
"wschwartz@...erecomputing.com" <wschwartz@...erecomputing.com>,
"dferguson@...erecomputing.com" <dferguson@...erecomputing.com>,
"wbs@...amperecomputing.com" <wbs@...amperecomputing.com>,
"nifan.cxl@...il.com" <nifan.cxl@...il.com>, tanxiaofei
<tanxiaofei@...wei.com>, "Zengtao (B)" <prime.zeng@...ilicon.com>, "Roberto
Sassu" <roberto.sassu@...wei.com>, "kangkang.shen@...urewei.com"
<kangkang.shen@...urewei.com>, wanghuiqiang <wanghuiqiang@...wei.com>,
Linuxarm <linuxarm@...wei.com>
Subject: Re: [PATCH v18 04/19] EDAC: Add memory repair control feature
On Tue, 14 Jan 2025 14:30:53 +0000
Shiju Jose <shiju.jose@...wei.com> wrote:
> >-----Original Message-----
> >From: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
> >Sent: 14 January 2025 13:47
> >To: Shiju Jose <shiju.jose@...wei.com>
> >Subject: Re: [PATCH v18 04/19] EDAC: Add memory repair control feature
> >
> >On Mon, 6 Jan 2025 12:10:00 +0000
> ><shiju.jose@...wei.com> wrote:
> >
> >> +What: /sys/bus/edac/devices/<dev-name>/mem_repairX/repair_function
> >> +Date: Jan 2025
> >> +KernelVersion: 6.14
> >> +Contact: linux-edac@...r.kernel.org
> >> +Description:
> >> + (RO) Memory repair function type. For eg. post package repair,
> >> + memory sparing etc.
> >> + EDAC_SOFT_PPR - Soft post package repair
> >> + EDAC_HARD_PPR - Hard post package repair
> >> + EDAC_CACHELINE_MEM_SPARING - Cacheline memory sparing
> >> + EDAC_ROW_MEM_SPARING - Row memory sparing
> >> + EDAC_BANK_MEM_SPARING - Bank memory sparing
> >> + EDAC_RANK_MEM_SPARING - Rank memory sparing
> >> + All other values are reserved.
> >> +
> >> +What: /sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
> >> +Date: Jan 2025
> >> +KernelVersion: 6.14
> >> +Contact: linux-edac@...r.kernel.org
> >> +Description:
> >> + (RW) Read/Write the current persist repair mode set for a
> >> + repair function. Persist repair modes supported in the
> >> + device, based on the memory repair function is temporary
> >> + or permanent and is lost with a power cycle.
> >> + EDAC_MEM_REPAIR_SOFT - Soft repair function (temporary repair).
> >> + EDAC_MEM_REPAIR_HARD - Hard memory repair function (permanent repair).
> >> + All other values are reserved.
> >> +
> >
> >After re-reading some things, I suspect that the above can be simplified a little
> >bit by folding soft/hard PPR into a single element at /repair_function, and making
> >it clearer that persist_mode is valid only for PPR (I think this is the case, right?),
> >e.g. something like:
> persist_mode is valid for memory sparing features (at least in CXL) as well.
> In the case of CXL memory sparing, the host has the option to request either soft or hard sparing
> via a flag when issuing a memory sparing operation.
Ok.
>
> >
> > What: /sys/bus/edac/devices/<dev-name>/mem_repairX/repair_function
> > ...
> > Description:
> > (RO) Memory repair function type, e.g. post package repair,
> > memory sparing etc. Valid values are:
> >
> > - ppr - post package repair.
> > Please define its mode via
> > /sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
> > - cacheline-sparing - Cacheline memory sparing
> > - row-sparing - Row memory sparing
> > - bank-sparing - Bank memory sparing
> > - rank-sparing - Rank memory sparing
> > - All other values are reserved.
> >
> >and define persist_mode in a different way:
> Note: To return decoded strings instead of raw values, I would need to add some extra callback functions
> in edac/memory_repair.c for these attributes, which would reduce the current level of optimization done to
> minimize the code size.
You're already using a callback in the EDAC_MEM_REPAIR_ATTR_SHOW macro.
So there is no need for any change to the current code, except for the
type used in the EDAC_MEM_REPAIR_ATTR_SHOW() call.

Something similar to this (not tested) would work:

int get_repair_function(struct device *dev, void *drv_data, const char **val)
{
	/* Map each repair function to the string reported via sysfs */
	static const char * const repair_type[] = {
		[EDAC_SOFT_PPR] = "ppr",
		[EDAC_HARD_PPR] = "ppr",
		[EDAC_CACHELINE_MEM_SPARING] = "cacheline-sparing",
		...
	};
	unsigned int type;

	// Some logic to get repair type from *drv_data, storing into "unsigned int type"

	/* Reject out-of-range values and unset (NULL) table entries */
	if (type < ARRAY_SIZE(repair_type) && repair_type[type]) {
		*val = repair_type[type];
		return 0;
	}

	return -EINVAL;
}

EDAC_MEM_REPAIR_ATTR_SHOW(repair_function, get_repair_function, const char *, "%s\n");

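To make the idea easier to try outside the kernel, here is a minimal userspace sketch of the same pattern: a string table indexed by the repair type, plus a show macro that takes the value type and the format string as parameters. Everything prefixed with demo_/DEMO_ is hypothetical and only stands in for the real EDAC names; the enum values, the driver-data struct and the simplified macro are assumptions for illustration, not the code from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the kernel enum; values are illustrative only. */
enum demo_repair_function {
	DEMO_SOFT_PPR,
	DEMO_HARD_PPR,
	DEMO_CACHELINE_MEM_SPARING,
	DEMO_ROW_MEM_SPARING,
	DEMO_BANK_MEM_SPARING,
	DEMO_RANK_MEM_SPARING,
};

/* Hypothetical driver data: it only carries the repair type. */
struct demo_drv_data {
	unsigned int type;
};

/* Callback: translate the raw repair type into its sysfs string. */
static int demo_get_repair_function(void *drv_data, const char **val)
{
	static const char * const repair_type[] = {
		[DEMO_SOFT_PPR]              = "ppr",
		[DEMO_HARD_PPR]              = "ppr",
		[DEMO_CACHELINE_MEM_SPARING] = "cacheline-sparing",
		[DEMO_ROW_MEM_SPARING]       = "row-sparing",
		[DEMO_BANK_MEM_SPARING]      = "bank-sparing",
		[DEMO_RANK_MEM_SPARING]      = "rank-sparing",
	};
	const struct demo_drv_data *d = drv_data;

	/* Reject out-of-range values and unset (NULL) table entries. */
	if (d->type < ARRAY_SIZE(repair_type) && repair_type[d->type]) {
		*val = repair_type[d->type];
		return 0;
	}

	return -EINVAL;
}

/*
 * Simplified show macro (an assumption, not the kernel one): it generates
 * demo_<attrib>_show(), which calls the callback and formats the result.
 * Only the value type and the format string differ between attributes.
 */
#define DEMO_ATTR_SHOW(attrib, cb, type, format)			\
static int demo_##attrib##_show(void *drv_data, char *buf, size_t len)	\
{									\
	type data;							\
	int ret = cb(drv_data, &data);					\
									\
	if (ret)							\
		return ret;						\
	return snprintf(buf, len, format, data);			\
}

DEMO_ATTR_SHOW(repair_function, demo_get_repair_function, const char *, "%s\n")
```

With this, calling demo_repair_function_show() on driver data whose type is DEMO_HARD_PPR fills the buffer with "ppr\n", and an unknown type returns -EINVAL. The point is that the string decoding lives entirely in the callback, so the macro itself stays generic.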
Thanks,
Mauro