Message-ID: <20250715150947.GAaHZvOxsvEvALZNDd@fat_crate.local>
Date: Tue, 15 Jul 2025 17:09:47 +0200
From: Borislav Petkov <bp@...en8.de>
To: Shuai Xue <xueshuai@...ux.alibaba.com>
Cc: Breno Leitao <leitao@...ian.org>, Alexander Graf <graf@...zon.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Peter Gonda <pgonda@...gle.com>, "Luck, Tony" <tony.luck@...el.com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Len Brown <lenb@...nel.org>, James Morse <james.morse@....com>,
"Moore, Robert" <robert.moore@...el.com>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"acpica-devel@...ts.linux.dev" <acpica-devel@...ts.linux.dev>,
"kernel-team@...a.com" <kernel-team@...a.com>
Subject: Re: [PATCH] ghes: Track number of recovered hardware errors
On Tue, Jul 15, 2025 at 09:46:03PM +0800, Shuai Xue wrote:
> For the purpose of counting, how about using the cmdline of rasdaemon?
That would mean you have to run rasdaemon on those machines before they
explode and then carve out the rasdaemon db from the coredump (this is
post-mortem analysis).
I would love for rasdaemon to log over the network so that other tools could
query those centralized logs, but that has its own challenges...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette