lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250109151854.GCZ3_o3rf6S24qUbtB@fat_crate.local>
Date: Thu, 9 Jan 2025 16:18:54 +0100
From: Borislav Petkov <bp@...en8.de>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: Shiju Jose <shiju.jose@...wei.com>,
	"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
	"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"tony.luck@...el.com" <tony.luck@...el.com>,
	"rafael@...nel.org" <rafael@...nel.org>,
	"lenb@...nel.org" <lenb@...nel.org>,
	"mchehab@...nel.org" <mchehab@...nel.org>,
	"dan.j.williams@...el.com" <dan.j.williams@...el.com>,
	"dave@...olabs.net" <dave@...olabs.net>,
	"dave.jiang@...el.com" <dave.jiang@...el.com>,
	"alison.schofield@...el.com" <alison.schofield@...el.com>,
	"vishal.l.verma@...el.com" <vishal.l.verma@...el.com>,
	"ira.weiny@...el.com" <ira.weiny@...el.com>,
	"david@...hat.com" <david@...hat.com>,
	"Vilas.Sridharan@....com" <Vilas.Sridharan@....com>,
	"leo.duran@....com" <leo.duran@....com>,
	"Yazen.Ghannam@....com" <Yazen.Ghannam@....com>,
	"rientjes@...gle.com" <rientjes@...gle.com>,
	"jiaqiyan@...gle.com" <jiaqiyan@...gle.com>,
	"Jon.Grimm@....com" <Jon.Grimm@....com>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
	"naoya.horiguchi@....com" <naoya.horiguchi@....com>,
	"james.morse@....com" <james.morse@....com>,
	"jthoughton@...gle.com" <jthoughton@...gle.com>,
	"somasundaram.a@....com" <somasundaram.a@....com>,
	"erdemaktas@...gle.com" <erdemaktas@...gle.com>,
	"pgonda@...gle.com" <pgonda@...gle.com>,
	"duenwen@...gle.com" <duenwen@...gle.com>,
	"gthelen@...gle.com" <gthelen@...gle.com>,
	"wschwartz@...erecomputing.com" <wschwartz@...erecomputing.com>,
	"dferguson@...erecomputing.com" <dferguson@...erecomputing.com>,
	"wbs@...amperecomputing.com" <wbs@...amperecomputing.com>,
	"nifan.cxl@...il.com" <nifan.cxl@...il.com>,
	tanxiaofei <tanxiaofei@...wei.com>,
	"Zengtao (B)" <prime.zeng@...ilicon.com>,
	Roberto Sassu <roberto.sassu@...wei.com>,
	"kangkang.shen@...urewei.com" <kangkang.shen@...urewei.com>,
	wanghuiqiang <wanghuiqiang@...wei.com>,
	Linuxarm <linuxarm@...wei.com>
Subject: Re: [PATCH v18 04/19] EDAC: Add memory repair control feature

On Thu, Jan 09, 2025 at 02:24:33PM +0000, Jonathan Cameron wrote:
> To my thinking that would fail the test of being an intuitive interface.
> To issue a repair command requires that multiple attributes be configured
> before triggering the actual repair.
> 
> Think of it as setting the coordinates of the repair in a high dimensional
> space.

Why?

You can write every attribute in its separate file and have a "commit" or
"start" file which does that.

Or you can designate a file which starts the process. This is how I'm
injecting errors on x86:

see readme_msg here: arch/x86/kernel/cpu/mce/inject.c

More specifically:

"flags:\t Injection type to be performed. Writing to this file will trigger a\n"
"\t real machine check, an APIC interrupt or invoke the error decoder routines\n"
"\t for AMD processors.\n"

So you set everything else, and as the last step you set the injection type
*and* you also trigger it with this one write.

> Sure. In this case the addition of min/max was perhaps a wrong response to
> your request for a way to those ranges rather than just rejecting a write
> of something out of range as earlier version did.
> 
> We can revisit in future if range discovery becomes necessary.  Personally
> I don't think it is given we are only taking these actions in response error
> records that give us precisely what to write and hence are always in range.

My goal here was to make this user-friendly. Because you need some way of
knowing what valid ranges are and in order to trigger the repair, if it needs
to happen for a range.

Or, you can teach the repair logic to ignore invalid ranges and "clamp" things
to whatever makes sense.

Again, I'm looking at it from the usability perspective. I haven't actually
needed this scrub+repair functionality yet to know whether the UI makes sense.
So yeah, collecting some feedback from real-life use cases would probably give
you a lot better understanding of how that UI should be designed... perhaps
you won't ever need the ranges, whow knows.

So yes, preemptively designing stuff like that "in the dark" is kinda hard.
:-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ