linux-kernel - Re: [PATCH v2 0/2] vmcoreinfo: Expose hardware error recovery statistics via sysfs

Message-Id: <20260210104612.5547717cb6b5da794d9c4724@linux-foundation.org>
Date: Tue, 10 Feb 2026 10:46:12 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Breno Leitao <leitao@...ian.org>
Cc: bhe@...hat.com, linux-kernel@...r.kernel.org, kexec@...ts.infradead.org,
 linux-arm-kernel@...ts.infradead.org, linux-acpi@...r.kernel.org,
 dyoung@...hat.com, tony.luck@...el.com, xueshuai@...ux.alibaba.com,
 vgoyal@...hat.com, zhiquan1.li@...el.com, olja@...a.com,
 kernel-team@...a.com
Subject: Re: [PATCH v2 0/2] vmcoreinfo: Expose hardware error recovery
 statistics via sysfs

On Tue, 10 Feb 2026 01:11:41 -0800 Breno Leitao <leitao@...ian.org> wrote:

> Hello Andrew,
> 
> On Mon, Feb 02, 2026 at 06:27:38AM -0800, Breno Leitao wrote:
> > The kernel already tracks recoverable hardware errors (CPU, memory, PCI,
> > CXL, etc.) in the hwerr_data array for vmcoreinfo crash dump analysis.
> > However, this data is only accessible after a crash.
> >
> > This series adds a sysfs directory at /sys/kernel/hwerr_recovery_stats/ to
> > expose these statistics at runtime, allowing monitoring tools to track
> > hardware health without requiring a kernel crash.
> >
> > The directory contains one file per error subsystem:
> >   /sys/kernel/hwerr_recovery_stats/{cpu, memory, pci, cxl, others}
> >
> > Each file contains a single integer representing the error count.
> >
> > This is useful for:
> > - Proactive detection of failing hardware components
> > - Time-series tracking of recoverable errors
> > - System health monitoring in cloud environments
> 
> Is there a chance this could be included in the 6.20 merge window?

During the 7.0 merge window?  Sure.  I'll be taking a look at this (and
a whole lot more) after 7.0-rc1 is released.
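
For illustration only: a minimal sketch of how a monitoring tool might poll
the counters described in the quoted cover letter. The directory and file
names are taken from that cover letter and exist only on a kernel carrying
this series. The helper names (read_hwerr_stats, delta) are hypothetical
and are not part of the patch.

```python
from pathlib import Path

# Path and file names as given in the cover letter. This is an assumption
# about an unmerged series, so the directory may not exist on your kernel.
HWERR_DIR = Path("/sys/kernel/hwerr_recovery_stats")
SUBSYSTEMS = ("cpu", "memory", "pci", "cxl", "others")


def read_hwerr_stats(base=HWERR_DIR):
    """Return {subsystem: count}, skipping files that are not present."""
    stats = {}
    for name in SUBSYSTEMS:
        try:
            # Each file is described as holding a single integer.
            stats[name] = int((Path(base) / name).read_text().strip())
        except FileNotFoundError:
            continue
    return stats


def delta(before, after):
    """Per-subsystem increase between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


if __name__ == "__main__":
    print(read_hwerr_stats())
```

Taking two snapshots some interval apart and passing them to delta() gives
the per-subsystem increase, which is the kind of time-series tracking the
cover letter mentions.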