[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240510100305.00000a2b@Huawei.com>
Date: Fri, 10 May 2024 10:03:05 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Dan Williams <dan.j.williams@...el.com>
CC: <shiju.jose@...wei.com>, <linux-cxl@...r.kernel.org>,
<linux-acpi@...r.kernel.org>, <linux-mm@...ck.org>, <dave@...olabs.net>,
<dave.jiang@...el.com>, <alison.schofield@...el.com>,
<vishal.l.verma@...el.com>, <ira.weiny@...el.com>,
<linux-edac@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<david@...hat.com>, <Vilas.Sridharan@....com>, <leo.duran@....com>,
<Yazen.Ghannam@....com>, <rientjes@...gle.com>, <jiaqiyan@...gle.com>,
<tony.luck@...el.com>, <Jon.Grimm@....com>, <dave.hansen@...ux.intel.com>,
<rafael@...nel.org>, <lenb@...nel.org>, <naoya.horiguchi@....com>,
<james.morse@....com>, <jthoughton@...gle.com>, <somasundaram.a@....com>,
<erdemaktas@...gle.com>, <pgonda@...gle.com>, <duenwen@...gle.com>,
<mike.malvestuto@...el.com>, <gthelen@...gle.com>,
<wschwartz@...erecomputing.com>, <dferguson@...erecomputing.com>,
<wbs@...amperecomputing.com>, <nifan.cxl@...il.com>, <tanxiaofei@...wei.com>,
<prime.zeng@...ilicon.com>, <kangkang.shen@...urewei.com>,
<wanghuiqiang@...wei.com>, <linuxarm@...wei.com>
Subject: Re: [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem
On Thu, 9 May 2024 14:47:56 -0700
Dan Williams <dan.j.williams@...el.com> wrote:
> shiju.jose@ wrote:
> > From: Shiju Jose <shiju.jose@...wei.com>
> >
> > Add scrub subsystem supports configuring the memory scrubbers
> > in the system. The scrub subsystem provides the interface for
> > registering the scrub devices. The scrub control attributes
> > are provided to the user in /sys/class/ras/rasX/scrub
> >
> > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> > Signed-off-by: Shiju Jose <shiju.jose@...wei.com>
> > ---
> > .../ABI/testing/sysfs-class-scrub-configure | 47 +++
> > drivers/ras/Kconfig | 7 +
> > drivers/ras/Makefile | 1 +
> > drivers/ras/memory_scrub.c | 271 ++++++++++++++++++
> > include/linux/memory_scrub.h | 37 +++
> > 5 files changed, 363 insertions(+)
> > create mode 100644 Documentation/ABI/testing/sysfs-class-scrub-configure
> > create mode 100755 drivers/ras/memory_scrub.c
> > create mode 100755 include/linux/memory_scrub.h
> >
> > diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure b/Documentation/ABI/testing/sysfs-class-scrub-configure
> > new file mode 100644
> > index 000000000000..3ed77dbb00ad
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure
> > @@ -0,0 +1,47 @@
> > +What: /sys/class/ras/
> > +Date: March 2024
> > +KernelVersion: 6.9
> > +Contact: linux-kernel@...r.kernel.org
> > +Description:
> > + The ras/ class subdirectory belongs to the
> > + common ras features such as scrub subsystem.
Hi Dan,
>
> Why create "ras" class versus just a "srcub" class? I am otherwise not
> aware of a precedent for class device hierarchy. For example, on my
> system there is:
I think that's miss described - aim is on subsystem, the first feature
supported is scrub. Intent here is to group RAS features of a given
device / interface etc into one place. This was a request in an review
of an earlier version on basis these interfaces tend to get grouped together
in a device.
So options are
/sys/class/ras/cxl_mem0/scrub/rate etc.
/sys/class/ras/cxl_mem0/ecs/rate etc
(maybe separate for ECS because it annoyingly looks nothing like scrub despite name
and there are multiple impelmentations)
vs
/sys/class/ras/cxl_mem0_scrub
/sys/class/ras/cxl_mem0_ecs
etc
Note that generic naming not including what the source was got
negative reviews in favor of making that the device instance name here.
So that rulled out simply
/sys/class/ras/scrubX/
/sys/class/ras/ecsX/
I don't mind which way we go; both are extensible.
>
> /sys/class/
> ├── scsi_device
> ├── scsi_disk
> ├── scsi_generic
> └── scsi_host
>
> ...not:
>
> /sys/class/scsi/
> ├── device
> ├── disk
> ├── generic
> └── host
That's a docs problem - this was never the intent.
>
>
> > +
> > +What: /sys/class/ras/rasX/scrub/
> > +Date: March 2024
> > +KernelVersion: 6.9
> > +Contact: linux-kernel@...r.kernel.org
> > +Description:
> > + The /sys/class/ras/ras{0,1,2,3,...}/scrub directories
> > + correspond to each scrub device registered with the
> > + scrub subsystem.
>
> I notice there are some visibility rules in the code, but those
> expectations are not documented here.
>
> This documentation would also help developers writing new users of
> scrub_device_register().
Agreed. One to improve.
>
> > +
> > +What: /sys/class/ras/rasX/scrub/name
> > +Date: March 2024
> > +KernelVersion: 6.9
> > +Contact: linux-kernel@...r.kernel.org
> > +Description:
> > + (RO) name of the memory scrubber
> > +
> > +What: /sys/class/ras/rasX/scrub/enable_background
> > +Date: March 2024
> > +KernelVersion: 6.9
> > +Contact: linux-kernel@...r.kernel.org
> > +Description:
> > + (RW) Enable/Disable background(patrol) scrubbing if supported.
> > +
> > +What: /sys/class/ras/rasX/scrub/rate_available
> > +Date: March 2024
> > +KernelVersion: 6.9
> > +Contact: linux-kernel@...r.kernel.org
> > +Description:
> > + (RO) Supported range for the scrub rate by the scrubber.
> > + The scrub rate represents in hours.
> > +
> > +What: /sys/class/ras/rasX/scrub/rate
> > +Date: March 2024
> > +KernelVersion: 6.9
> > +Contact: linux-kernel@...r.kernel.org
> > +Description:
> > + (RW) The scrub rate specified and it must be with in the
> > + supported range by the scrubber.
> > + The scrub rate represents in hours.
> > diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
> > index fc4f4bb94a4c..181701479564 100644
> > --- a/drivers/ras/Kconfig
> > +++ b/drivers/ras/Kconfig
> > @@ -46,4 +46,11 @@ config RAS_FMPM
> > Memory will be retired during boot time and run time depending on
> > platform-specific policies.
> >
> > +config SCRUB
> > + tristate "Memory scrub driver"
> > + help
> > + This option selects the memory scrub subsystem, supports
> > + configuring the parameters of underlying scrubbers in the
> > + system for the DRAM memories.
> > +
> > endif
> > diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
> > index 11f95d59d397..89bcf0d84355 100644
> > --- a/drivers/ras/Makefile
> > +++ b/drivers/ras/Makefile
> > @@ -2,6 +2,7 @@
> > obj-$(CONFIG_RAS) += ras.o
> > obj-$(CONFIG_DEBUG_FS) += debugfs.o
> > obj-$(CONFIG_RAS_CEC) += cec.o
> > +obj-$(CONFIG_SCRUB) += memory_scrub.o
> >
> > obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
> > obj-y += amd/atl/
> > diff --git a/drivers/ras/memory_scrub.c b/drivers/ras/memory_scrub.c
> > new file mode 100755
> > index 000000000000..7e995380ec3a
> > --- /dev/null
> > +++ b/drivers/ras/memory_scrub.c
> > @@ -0,0 +1,271 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Memory scrub subsystem supports configuring the registered
> > + * memory scrubbers.
> > + *
> > + * Copyright (c) 2024 HiSilicon Limited.
> > + */
> > +
> > +#define pr_fmt(fmt) "MEM SCRUB: " fmt
> > +
> > +#include <linux/acpi.h>
> > +#include <linux/bitops.h>
> > +#include <linux/delay.h>
> > +#include <linux/kfifo.h>
> > +#include <linux/memory_scrub.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/spinlock.h>
> > +
> > +/* memory scrubber config definitions */
> > +#define SCRUB_ID_PREFIX "ras"
> > +#define SCRUB_ID_FORMAT SCRUB_ID_PREFIX "%d"
> > +
> > +static DEFINE_IDA(scrub_ida);
> > +
> > +struct scrub_device {
> > + int id;
> > + struct device dev;
> > + const struct scrub_ops *ops;
> > +};
> > +
> > +#define to_scrub_device(d) container_of(d, struct scrub_device, dev)
> > +static ssize_t enable_background_store(struct device *dev,
> > + struct device_attribute *attr,
> > + const char *buf, size_t len)
> > +{
> > + struct scrub_device *scrub_dev = to_scrub_device(dev);
> > + bool enable;
> > + int ret;
> > +
> > + ret = kstrtobool(buf, &enable);
> > + if (ret < 0)
> > + return ret;
> > +
> > + ret = scrub_dev->ops->set_enabled_bg(dev, enable);
> > + if (ret)
> > + return ret;
>
> It strikes me as somewhat pointless to have such a thin sysfs
> implementation whose only job is to call down into a callback to do the
> work. Unless there are other consumers of 'struct scrub_ops' outside of
> these sysfs files why not just have the low-level drivers register their
> corresponding attributes themselves?
>
> Unless the functionality is truly generic just let the low-level driver
> be responsible for conforming to the sysfs ABI expectations, and, for
> example, each register their own "enable_background" attribute if they
> support that semantic.
This was me pushing for this based on that approach having been a pain
in subystems I've been involved with in the past. so I'll answer.
Maybe if we think the number of scrub drivers remains very low we can
rely on ABI review. However, it's painful. Everyone wants to add
their own custom ABI, so every review consists of 'no that is
isn't consistent' reviews. The callback schemes reduce that considerably.
As someone with their name next to one of the largest sysfs ABIs in the
kernel, maybe I'm projecting my pain points on this one.
Note that this approach has failed for multiple similar simple subsystems
in the past and they have migrated to a (mostly) tighter description for
ABI simply because those constraints are useful. A fairly recent one
maybe 8 years ago? Was hwmon. There are other advantages that may not
yet apply here (in kernel interfaces are much easier, even if they are
only occasionally used for a given subsystem), but my motivation in
pushing Shiju this way was to lock down the userspace interface.
>
> So scrub_device_register() would grow a 'const struct attribute_group
> **groups' argument, or something along those lines.
Sure. Shiju had that in an earlier version. Personally I think it's
an approach that may bite in the long run, but meh, maybe this will
only ever have half a dozen drivers so it might remain manageable.
If not, I love say 'I told you so' :)
Jonathan
Powered by blists - more mailing lists