lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4ee36d03a2894606a571b37f440da36f@huawei.com>
Date: Wed, 23 Oct 2024 16:04:05 +0000
From: Shiju Jose <shiju.jose@...wei.com>
To: Borislav Petkov <bp@...en8.de>
CC: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
	"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "tony.luck@...el.com" <tony.luck@...el.com>,
	"rafael@...nel.org" <rafael@...nel.org>, "lenb@...nel.org" <lenb@...nel.org>,
	"mchehab@...nel.org" <mchehab@...nel.org>, "dan.j.williams@...el.com"
	<dan.j.williams@...el.com>, "dave@...olabs.net" <dave@...olabs.net>,
	"Jonathan Cameron" <jonathan.cameron@...wei.com>, "dave.jiang@...el.com"
	<dave.jiang@...el.com>, "alison.schofield@...el.com"
	<alison.schofield@...el.com>, "vishal.l.verma@...el.com"
	<vishal.l.verma@...el.com>, "ira.weiny@...el.com" <ira.weiny@...el.com>,
	"david@...hat.com" <david@...hat.com>, "Vilas.Sridharan@....com"
	<Vilas.Sridharan@....com>, "leo.duran@....com" <leo.duran@....com>,
	"Yazen.Ghannam@....com" <Yazen.Ghannam@....com>, "rientjes@...gle.com"
	<rientjes@...gle.com>, "jiaqiyan@...gle.com" <jiaqiyan@...gle.com>,
	"Jon.Grimm@....com" <Jon.Grimm@....com>, "dave.hansen@...ux.intel.com"
	<dave.hansen@...ux.intel.com>, "naoya.horiguchi@....com"
	<naoya.horiguchi@....com>, "james.morse@....com" <james.morse@....com>,
	"jthoughton@...gle.com" <jthoughton@...gle.com>, "somasundaram.a@....com"
	<somasundaram.a@....com>, "erdemaktas@...gle.com" <erdemaktas@...gle.com>,
	"pgonda@...gle.com" <pgonda@...gle.com>, "duenwen@...gle.com"
	<duenwen@...gle.com>, "gthelen@...gle.com" <gthelen@...gle.com>,
	"wschwartz@...erecomputing.com" <wschwartz@...erecomputing.com>,
	"dferguson@...erecomputing.com" <dferguson@...erecomputing.com>,
	"wbs@...amperecomputing.com" <wbs@...amperecomputing.com>,
	"nifan.cxl@...il.com" <nifan.cxl@...il.com>, tanxiaofei
	<tanxiaofei@...wei.com>, "Zengtao (B)" <prime.zeng@...ilicon.com>, "Roberto
 Sassu" <roberto.sassu@...wei.com>, "kangkang.shen@...urewei.com"
	<kangkang.shen@...urewei.com>, wanghuiqiang <wanghuiqiang@...wei.com>,
	Linuxarm <linuxarm@...wei.com>
Subject: RE: [PATCH v13 02/18] EDAC: Add scrub control feature


Thanks for the feedbacks.

>-----Original Message-----
>From: Borislav Petkov <bp@...en8.de>
>Sent: 22 October 2024 20:05
>To: Shiju Jose <shiju.jose@...wei.com>
>Cc: linux-edac@...r.kernel.org; linux-cxl@...r.kernel.org; linux-
>acpi@...r.kernel.org; linux-mm@...ck.org; linux-kernel@...r.kernel.org;
>tony.luck@...el.com; rafael@...nel.org; lenb@...nel.org;
>mchehab@...nel.org; dan.j.williams@...el.com; dave@...olabs.net; Jonathan
>Cameron <jonathan.cameron@...wei.com>; dave.jiang@...el.com;
>alison.schofield@...el.com; vishal.l.verma@...el.com; ira.weiny@...el.com;
>david@...hat.com; Vilas.Sridharan@....com; leo.duran@....com;
>Yazen.Ghannam@....com; rientjes@...gle.com; jiaqiyan@...gle.com;
>Jon.Grimm@....com; dave.hansen@...ux.intel.com;
>naoya.horiguchi@....com; james.morse@....com; jthoughton@...gle.com;
>somasundaram.a@....com; erdemaktas@...gle.com; pgonda@...gle.com;
>duenwen@...gle.com; gthelen@...gle.com;
>wschwartz@...erecomputing.com; dferguson@...erecomputing.com;
>wbs@...amperecomputing.com; nifan.cxl@...il.com; tanxiaofei
><tanxiaofei@...wei.com>; Zengtao (B) <prime.zeng@...ilicon.com>; Roberto
>Sassu <roberto.sassu@...wei.com>; kangkang.shen@...urewei.com;
>wanghuiqiang <wanghuiqiang@...wei.com>; Linuxarm
><linuxarm@...wei.com>
>Subject: Re: [PATCH v13 02/18] EDAC: Add scrub control feature
>
>On Wed, Oct 09, 2024 at 01:41:03PM +0100, shiju.jose@...wei.com wrote:
>> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub
>> b/Documentation/ABI/testing/sysfs-edac-scrub
>> new file mode 100644
>> index 000000000000..b4f8f0bba17b
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-edac-scrub
>> @@ -0,0 +1,69 @@
>> +What:		/sys/bus/edac/devices/<dev-name>/scrubX
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		The sysfs EDAC bus devices /<dev-name>/scrubX subdirectory
>> +		belongs to an instance of memory scrub control feature,
>> +		where <dev-name> directory corresponds to a device/memory
>> +		region registered with the EDAC device driver for the
>> +		scrub control feature.
>> +		The sysfs scrub attr nodes would be present only if the
>> +		client driver has implemented the corresponding attr
>> +		callback function and passed in ops to the EDAC RAS feature
>> +		driver during registration.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/addr_range_base
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		(RW) The base of the address range of the memory region
>> +		to be scrubbed (on-demand scrubbing).
>
>Why does this sound more complicated than it is?
>
>Why isn't this simply "addr" and the next one "size"?
>
>On-demand scrubbing should scrub at address "addr" and "size" bytes. Can't get
>any simpler than that.
Sure. Will modify to addr and size.
>
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/addr_range_size
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		(RW) The size of the address range of the memory region
>> +		to be scrubbed (on-demand scrubbing).
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/enable_background
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		(RW) Start/Stop background(patrol) scrubbing if supported.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/enable_on_demand
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		(RW) Start/Stop on-demand scrubbing the memory region
>> +		if supported.
>
>Why do you need a separate "enable" flag?
>
>Why can't it be: "writing into "addr" starts the on-demand scrubbing"?
If  'enable' attribute is removed , then there is an ordering with setting address + size.
Also user space can't check whether scrubbing is enabled or not. 
>
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/min_cycle_duration
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		(RO) Supported minimum scrub cycle duration in seconds
>> +		by the memory scrubber.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/max_cycle_duration
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		(RO) Supported maximum scrub cycle duration in seconds
>> +		by the memory scrubber.
>> +
>> +What:		/sys/bus/edac/devices/<dev-
>name>/scrubX/current_cycle_duration
>> +Date:		Oct 2024
>> +KernelVersion:	6.12
>> +Contact:	linux-edac@...r.kernel.org
>> +Description:
>> +		(RW) The current scrub cycle duration in seconds and must be
>> +		within the supported range by the memory scrubber.
>
>What are those three good for and why are they exposed?
Scrub has an overhead when running and that may want to be reduced by
just taking longer to do it. 
Min and max scrub cycle duration returns the range of scrub rate
supported by the device.
>
>> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index
>> 4edfb83ffbee..a96a74de8b9e 100644
>> --- a/drivers/edac/Makefile
>> +++ b/drivers/edac/Makefile
>> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC)			:= edac_core.o
>>
>>  edac_core-y	:= edac_mc.o edac_device.o edac_mc_sysfs.o
>>  edac_core-y	+= edac_module.o edac_device_sysfs.o wq.o
>> +edac_core-y	+= scrub.o
>>
>>  edac_core-$(CONFIG_EDAC_DEBUG)		+= debugfs.o
>>
>> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
>> index 0b8aa8150239..0c9da55d18bc 100644
>> --- a/drivers/edac/edac_device.c
>> +++ b/drivers/edac/edac_device.c
>> @@ -576,6 +576,7 @@ static void edac_dev_release(struct device *dev)
>> {
>>  	struct edac_dev_feat_ctx *ctx = container_of(dev, struct
>> edac_dev_feat_ctx, dev);
>>
>> +	kfree(ctx->scrub);
>>  	kfree(ctx->dev.groups);
>>  	kfree(ctx);
>>  }
>> @@ -610,7 +611,9 @@ int edac_dev_register(struct device *parent, char
>*name,
>>  		      const struct edac_dev_feature *ras_features)  {
>>  	const struct attribute_group **ras_attr_groups;
>> +	struct edac_dev_data *dev_data;
>>  	struct edac_dev_feat_ctx *ctx;
>> +	int scrub_cnt = 0, scrub_inst = 0;
>>  	int attr_gcnt = 0;
>>  	int ret, feat;
>
>The EDAC tree preferred ordering of variable declarations at the beginning of a
>function is reverse fir tree order::
I used the reverse tree ordering, but missed here. I will correct. 
>
>	struct long_struct_name *descriptive_name;
>	unsigned long foo, bar;
>	unsigned int tmp;
>	int ret;
>
>The above is faster to parse than the reverse ordering::
>
>	int ret;
>	unsigned int tmp;
>	unsigned long foo, bar;
>	struct long_struct_name *descriptive_name;
>
>And even more so than random ordering::
>
>	unsigned long foo, bar;
>	int ret;
>	struct long_struct_name *descriptive_name;
>	unsigned int tmp;
>
>>
>> @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char
>*name,
>>  	/* Double parse to make space for attributes */
>>  	for (feat = 0; feat < num_features; feat++) {
>>  		switch (ras_features[feat].ft_type) {
>> -		/* Add feature specific code */
>> +		case RAS_FEAT_SCRUB:
>> +			attr_gcnt++;
>> +			scrub_cnt++;
>> +			break;
>>  		default:
>>  			return -EINVAL;
>>  		}
>> @@ -639,13 +645,36 @@ int edac_dev_register(struct device *parent, char
>*name,
>>  		goto ctx_free;
>>  	}
>>
>> +	if (scrub_cnt) {
>> +		ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub),
>GFP_KERNEL);
>> +		if (!ctx->scrub) {
>> +			ret = -ENOMEM;
>> +			goto groups_free;
>> +		}
>> +	}
>> +
>>  	attr_gcnt = 0;
>>  	for (feat = 0; feat < num_features; feat++, ras_features++) {
>>  		switch (ras_features->ft_type) {
>> -		/* Add feature specific code */
>> +		case RAS_FEAT_SCRUB:
>> +			if (!ras_features->scrub_ops)
>> +				continue;
>
>Continue?
>
>I think fail. What is a scrub feature good for if it doesn't have ops?
Here continue to check any other feature (for eg. ECS, memory repair or another scrub instance) listed
by the parent device in the ras_features[].   
>
>> +			if (scrub_inst != ras_features->instance)
>> +				goto data_mem_free;
>> +			dev_data = &ctx->scrub[scrub_inst];
>> +			dev_data->instance = scrub_inst;
>> +			dev_data->scrub_ops = ras_features->scrub_ops;
>> +			dev_data->private = ras_features->ctx;
>> +			ret = edac_scrub_get_desc(parent,
>&ras_attr_groups[attr_gcnt],
>> +						  ras_features->instance);
>> +			if (ret)
>> +				goto data_mem_free;
>> +			scrub_inst++;
>> +			attr_gcnt++;
>> +			break;
>>  		default:
>>  			ret = -EINVAL;
>> -			goto groups_free;
>> +			goto data_mem_free;
>>  		}
>>  	}
>>
>
>...
>
>--
>Regards/Gruss,
>    Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette

Thanks,
Shiju

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ