Message-ID: <SJ1PR11MB608389A75F07F37C79CB099CFC43A@SJ1PR11MB6083.namprd11.prod.outlook.com>
Date: Thu, 3 Jul 2025 17:51:00 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Breno Leitao <leitao@...ian.org>
CC: "Rafael J. Wysocki" <rafael@...nel.org>, Len Brown <lenb@...nel.org>,
James Morse <james.morse@....com>, Borislav Petkov <bp@...en8.de>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kernel-team@...a.com" <kernel-team@...a.com>, "kbusch@...nel.org"
<kbusch@...nel.org>, "rmikey@...a.com" <rmikey@...a.com>
Subject: RE: [PATCH] acpi/ghes: add TAINT_MACHINE_CHECK on GHES panic path
> In summary, I don't think we should solve the problem of correlation
> here, given it is not straightforward. I just want to tag that the
> hardware got an error while the kernel was running, and the operator can
> use this information the way they want.
>
> Am I on the right track?
It seems that Rafael has just applied your patch for taint with the machine check
option. So you've got what you originally asked for.
If you want to pursue the idea of a taint for GHES warnings, then create a
new patch that does that to spark discussion. Your case would be helped
if you have some data to back up the need for this. E.g. we have observed
"X% of recovered GHES errors are followed by a system crash within Y minutes".
If you don't have hard numbers, then at least some "We often/sometimes see
a crash shortly after a recovered GHES error that appears related."
There are only a few unused capital letters for the taint summary: H, Q,
V, Y, Z. None super-intuitive. Either pick one, or move into uncharted
territory of using lower case ('g'?).
-Tony