Message-ID: <20241030161628.00001fdc@Huawei.com>
Date: Wed, 30 Oct 2024 16:16:28 +0000
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Dave Jiang <dave.jiang@...el.com>
CC: Shiju Jose <shiju.jose@...wei.com>, "linux-edac@...r.kernel.org"
<linux-edac@...r.kernel.org>, "linux-cxl@...r.kernel.org"
<linux-cxl@...r.kernel.org>, "linux-acpi@...r.kernel.org"
<linux-acpi@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "bp@...en8.de"
<bp@...en8.de>, "tony.luck@...el.com" <tony.luck@...el.com>,
"rafael@...nel.org" <rafael@...nel.org>, "lenb@...nel.org" <lenb@...nel.org>,
"mchehab@...nel.org" <mchehab@...nel.org>, "dan.j.williams@...el.com"
<dan.j.williams@...el.com>, "dave@...olabs.net" <dave@...olabs.net>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"sudeep.holla@....com" <sudeep.holla@....com>, "jassisinghbrar@...il.com"
<jassisinghbrar@...il.com>, "alison.schofield@...el.com"
<alison.schofield@...el.com>, "vishal.l.verma@...el.com"
<vishal.l.verma@...el.com>, "ira.weiny@...el.com" <ira.weiny@...el.com>,
"david@...hat.com" <david@...hat.com>, "Vilas.Sridharan@....com"
<Vilas.Sridharan@....com>, "leo.duran@....com" <leo.duran@....com>,
"Yazen.Ghannam@....com" <Yazen.Ghannam@....com>, "rientjes@...gle.com"
<rientjes@...gle.com>, "jiaqiyan@...gle.com" <jiaqiyan@...gle.com>,
"Jon.Grimm@....com" <Jon.Grimm@....com>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "naoya.horiguchi@....com"
<naoya.horiguchi@....com>, "james.morse@....com" <james.morse@....com>,
"jthoughton@...gle.com" <jthoughton@...gle.com>, "somasundaram.a@....com"
<somasundaram.a@....com>, "erdemaktas@...gle.com" <erdemaktas@...gle.com>,
"pgonda@...gle.com" <pgonda@...gle.com>, "duenwen@...gle.com"
<duenwen@...gle.com>, "gthelen@...gle.com" <gthelen@...gle.com>,
"wschwartz@...erecomputing.com" <wschwartz@...erecomputing.com>,
"dferguson@...erecomputing.com" <dferguson@...erecomputing.com>,
"wbs@...amperecomputing.com" <wbs@...amperecomputing.com>,
"nifan.cxl@...il.com" <nifan.cxl@...il.com>, tanxiaofei
<tanxiaofei@...wei.com>, "Zengtao (B)" <prime.zeng@...ilicon.com>, "Roberto
Sassu" <roberto.sassu@...wei.com>, "kangkang.shen@...urewei.com"
<kangkang.shen@...urewei.com>, wanghuiqiang <wanghuiqiang@...wei.com>,
Linuxarm <linuxarm@...wei.com>
Subject: Re: [PATCH v14 07/14] cxl/memfeature: Add CXL memory device patrol
scrub control feature
On Tue, 29 Oct 2024 11:32:47 -0700
Dave Jiang <dave.jiang@...el.com> wrote:
> On 10/29/24 10:00 AM, Shiju Jose wrote:
> >
> >
> >> -----Original Message-----
> >> From: Dave Jiang <dave.jiang@...el.com>
> >> Sent: 29 October 2024 16:32
> >> To: Shiju Jose <shiju.jose@...wei.com>; linux-edac@...r.kernel.org; linux-
> >> cxl@...r.kernel.org; linux-acpi@...r.kernel.org; linux-mm@...ck.org; linux-
> >> kernel@...r.kernel.org
> >> Cc: bp@...en8.de; tony.luck@...el.com; rafael@...nel.org; lenb@...nel.org;
> >> mchehab@...nel.org; dan.j.williams@...el.com; dave@...olabs.net; Jonathan
> >> Cameron <jonathan.cameron@...wei.com>; gregkh@...uxfoundation.org;
> >> sudeep.holla@....com; jassisinghbrar@...il.com; alison.schofield@...el.com;
> >> vishal.l.verma@...el.com; ira.weiny@...el.com; david@...hat.com;
> >> Vilas.Sridharan@....com; leo.duran@....com; Yazen.Ghannam@....com;
> >> rientjes@...gle.com; jiaqiyan@...gle.com; Jon.Grimm@....com;
> >> dave.hansen@...ux.intel.com; naoya.horiguchi@....com;
> >> james.morse@....com; jthoughton@...gle.com; somasundaram.a@....com;
> >> erdemaktas@...gle.com; pgonda@...gle.com; duenwen@...gle.com;
> >> gthelen@...gle.com; wschwartz@...erecomputing.com;
> >> dferguson@...erecomputing.com; wbs@...amperecomputing.com;
> >> nifan.cxl@...il.com; tanxiaofei <tanxiaofei@...wei.com>; Zengtao (B)
> >> <prime.zeng@...ilicon.com>; Roberto Sassu <roberto.sassu@...wei.com>;
> >> kangkang.shen@...urewei.com; wanghuiqiang <wanghuiqiang@...wei.com>;
> >> Linuxarm <linuxarm@...wei.com>
> >> Subject: Re: [PATCH v14 07/14] cxl/memfeature: Add CXL memory device patrol
> >> scrub control feature
> >>
> >>
> >>
> >> On 10/25/24 10:13 AM, shiju.jose@...wei.com wrote:
> >>> From: Shiju Jose <shiju.jose@...wei.com>
> >>>
> >>> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub
> >>> control feature. The device patrol scrub proactively locates and makes
> >>> corrections to errors in regular cycle.
> >>>
> >>> Allow specifying the number of hours within which the patrol scrub
> >>> must be completed, subject to minimum and maximum limits reported by the
> >>> device.
> >>> Also allow disabling scrub allowing trade-off error rates against
> >>> performance.
> >>>
> >>> Add support for patrol scrub control on CXL memory devices.
> >>> Register with the EDAC device driver, which retrieves the scrub
> >>> attribute descriptors from EDAC scrub and exposes the sysfs scrub
> >>> control attributes to userspace. For example, scrub control for the
> >>> CXL memory device "cxl_mem0" is exposed in
> >>> /sys/bus/edac/devices/cxl_mem0/scrubX/.
> >>>
> >>> Additionally, add support for region-based CXL memory patrol scrub control.
> >>> CXL memory regions may be interleaved across one or more CXL memory
> >>> devices. For example, region-based scrub control for "cxl_region1" is
> >>> exposed in /sys/bus/edac/devices/cxl_region1/scrubX/.
> >>>
> >>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> >>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> >>> Signed-off-by: Shiju Jose <shiju.jose@...wei.com>
> >>> ---
> >>> Documentation/edac/edac-scrub.rst | 74 ++++++
> >>> drivers/cxl/Kconfig | 18 ++
> >>> drivers/cxl/core/Makefile | 1 +
> >>> drivers/cxl/core/memfeature.c | 381 ++++++++++++++++++++++++++++++
> >>> drivers/cxl/core/region.c | 6 +
> >>> drivers/cxl/cxlmem.h | 7 +
> >>> drivers/cxl/mem.c | 4 +
> >>> 7 files changed, 491 insertions(+)
> >>> create mode 100644 Documentation/edac/edac-scrub.rst
> >>> create mode 100644 drivers/cxl/core/memfeature.c
> >>>
> >>> diff --git a/Documentation/edac/edac-scrub.rst
> >>> b/Documentation/edac/edac-scrub.rst
> >>> new file mode 100644
> >>> index 000000000000..4aad4974b208
> >>> --- /dev/null
> >>> +++ b/Documentation/edac/edac-scrub.rst
> >>> @@ -0,0 +1,74 @@
> >>> +.. SPDX-License-Identifier: GPL-2.0
> >>> +
> > [...]
> >
> >>> +static int cxl_mem_ps_get_attrs(struct cxl_memdev_state *mds,
> >>> + struct cxl_memdev_ps_params *params) {
> >>> + size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
> >>> + size_t data_size;
> >>> + struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
> >>> + kmalloc(rd_data_size, GFP_KERNEL);
> >>> + if (!rd_attrs)
> >>> + return -ENOMEM;
> >>> +
> >>> + data_size = cxl_get_feature(mds, cxl_patrol_scrub_uuid,
> >>> + CXL_GET_FEAT_SEL_CURRENT_VALUE,
> >>> + rd_attrs, rd_data_size);
> >>> + if (!data_size)
> >>> + return -EIO;
> >>> +
> >>> + params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
> >>> + rd_attrs->scrub_cycle_cap);
> >>> + params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> >>> + rd_attrs->scrub_flags);
> >>> + params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> >>> + rd_attrs->scrub_cycle_hrs);
> >>> + params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
> >>> + rd_attrs->scrub_cycle_hrs);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int cxl_ps_get_attrs(struct device *dev, void *drv_data,
> >>
> >> Would a union be better than a void *drv_data for all the places this is used as a
> >> parameter? How many variations of this are there?
> >>
> >> DJ
> > Hi Dave,
> >
> > Can you give more info on this given this is a generic callback for the scrub control and each
> > implementation will have its own context struct (for eg. struct cxl_patrol_scrub_context here
> > for CXL scrub control), which in turn will be passed in and out as opaque data.
>
> Mainly I'm just seeing a lot of calls with (void *). Just asking if we want to make it a union that contains 'struct cxl_patrol_scrub_context' and etc.
You could, but then every new driver would need to change the
EDAC core to add its own entry to that union.
I'm not sure that's a good way to go for opaque, driver-specific context.

This particular function, though, can take a
struct cxl_patrol_scrub_context * anyway: it is not part of the
core interface, but is only called indirectly by functions
that are passed a void * and know it is a
struct cxl_patrol_scrub_context *.
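
To illustrate the split, here is a rough sketch. All names in it
(scrub_ops, demo_scrub_context, core_read_cycle, ...) are made up for
this example and are not the actual EDAC or CXL definitions, and it is
plain userspace C rather than kernel code. The core only ever sees a
void *, the registered callback is the one place that converts it, and
the inner helper takes the real type:

```c
#include <errno.h>
#include <stddef.h>

/* Core side (hypothetical): knows nothing about driver context types. */
struct scrub_ops {
	int (*get_cycle)(void *drv_data, unsigned int *hrs);
};

/* Driver side (hypothetical): private context, opaque to the core. */
struct demo_scrub_context {
	unsigned int cycle_hrs;
};

/*
 * Inner helper. It is not part of the core interface, so it can take
 * the real context type instead of a void *.
 */
static int demo_get_attrs(struct demo_scrub_context *ctx, unsigned int *hrs)
{
	if (!ctx || !hrs)
		return -EINVAL;

	*hrs = ctx->cycle_hrs;
	return 0;
}

/*
 * Callback registered with the core. This is the only place where the
 * opaque pointer is turned back into the driver's own type.
 */
static int demo_get_cycle(void *drv_data, unsigned int *hrs)
{
	struct demo_scrub_context *ctx = drv_data;

	return demo_get_attrs(ctx, hrs);
}

static const struct scrub_ops demo_scrub_ops = {
	.get_cycle = demo_get_cycle,
};

/* Core dispatch: passes drv_data through without interpreting it. */
static int core_read_cycle(const struct scrub_ops *ops, void *drv_data,
			   unsigned int *hrs)
{
	if (!ops || !ops->get_cycle)
		return -EINVAL;

	return ops->get_cycle(drv_data, hrs);
}
```

With this shape, a new driver adds its own context struct and ops table
without touching the core, which is what a union in the core would
prevent.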
Jonathan
>
> >
> > Thanks,
> > Shiju
> >>
> >>> + struct cxl_memdev_ps_params *params) {
> >>> + struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
> >>> + struct cxl_memdev *cxlmd;
> >>> + struct cxl_dev_state *cxlds;
> >>> + struct cxl_memdev_state *mds;
> >>> + u16 min_scrub_cycle = 0;
> >>> + int i, ret;
> >>> +
> >>> + if (cxl_ps_ctx->cxlr) {
> >>> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> >>> + struct cxl_region_params *p = &cxlr->params;
> >>> +
> >>> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> >>> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> >>> +
> >>> + cxlmd = cxled_to_memdev(cxled);
> >>> + cxlds = cxlmd->cxlds;
> >>> + mds = to_cxl_memdev_state(cxlds);
> >>> + ret = cxl_mem_ps_get_attrs(mds, params);
> >>> + if (ret)
> >>> + return ret;
> >>> +
> >>> + if (params->min_scrub_cycle_hrs > min_scrub_cycle)
> >>> + min_scrub_cycle = params->min_scrub_cycle_hrs;
> >>> + }
> >>> + params->min_scrub_cycle_hrs = min_scrub_cycle;
> >>> + return 0;
> >>> + }
> >>> + cxlmd = cxl_ps_ctx->cxlmd;
> >>> + cxlds = cxlmd->cxlds;
> >>> + mds = to_cxl_memdev_state(cxlds);
> >>> +
> >>> + return cxl_mem_ps_get_attrs(mds, params); }
> >>> +
> > [...]
> >>
> >
>
>