Message-ID: <CY8PR11MB7134D97F82DC001AE009637889E32@CY8PR11MB7134.namprd11.prod.outlook.com>
Date: Fri, 24 Jan 2025 10:43:42 +0000
From: "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>
To: Nikolay Borisov <nik.borisov@...e.com>, "linux-edac@...r.kernel.org"
	<linux-edac@...r.kernel.org>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>
Subject: RE: [RESEND PATCH 3/3] x86/mce: Make mce_notify_irq() depend on
 CONFIG_X86_MCELOG_LEGACY

> From: Nikolay Borisov <nik.borisov@...e.com>
> [...]
> >> --- a/arch/x86/kernel/cpu/mce/core.c
> >> +++ b/arch/x86/kernel/cpu/mce/core.c
> >> @@ -591,6 +591,7 @@ EXPORT_SYMBOL_GPL(mce_is_correctable);
> >>    */
> >>   static int mce_notify_irq(void)
> >>   {
> >> +#ifdef CONFIG_X86_MCELOG_LEGACY
> >>   	/* Not more than two messages every minute */
> >>   	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
> >>
> >> @@ -602,7 +603,7 @@ static int mce_notify_irq(void)
> >>
> >
> > The message printed inside this function should not depend on
> > CONFIG_X86_MCELOG_LEGACY.  User-space tools/scripts might look for
> > this message to detect machine events. It is also useful for debugging
> > purposes.
> 
> The thing is if MCELOG_LEGACY is turned off then mce_work_trigger is a
> noop, hence nothing is really logged which makes this message somewhat
> bogus. After all the early handler's job is to log to userspace, if we don't log
> anything no need to spam the kernel log.

Some customers have recently reported that the Intel EDAC driver did not
report errors on some memory DIMMs. The message printed here helped me
confirm whether the MCE event originated from the x86/mce code or was
lost somewhere in the EDAC driver.

IMHO, it would be better to keep this message, or to adjust it slightly as
below when CONFIG_X86_MCELOG_LEGACY is not set:

   pr_info(HW_ERR "Machine check events generated\n");
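
To make the idea concrete, here is a rough userspace sketch, not kernel code
and not a proposed patch. Every mock_* name and MOCK_MCELOG_LEGACY are
hypothetical stand-ins, the rate limiter is a simplified imitation of the
"two messages per minute" limit in the quoted hunk, and the legacy wording
of the message is my assumption. It only shows the shape: the legacy wakeup
depends on the config option, while a rate-limited message is printed either
way.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace mock only. All names here are hypothetical stand-ins. */

/* Define MOCK_MCELOG_LEGACY to emulate CONFIG_X86_MCELOG_LEGACY=y. */
#ifdef MOCK_MCELOG_LEGACY
#define MOCK_LEGACY_ENABLED 1
#else
#define MOCK_LEGACY_ENABLED 0
#endif

static bool mock_notify_pending; /* set when an event needs reporting */
static int mock_legacy_wakeups;  /* emulated wakeups of a legacy reader */
static int mock_messages;        /* number of log lines actually printed */

/* Simplified rate limit: at most `burst` messages per `interval` ticks. */
struct mock_ratelimit {
	long interval;
	int burst;
	long begin;
	int printed;
};

static bool mock_ratelimit_ok(struct mock_ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* A new window starts: reset the counter. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* Returns 1 if a pending event was handled, 0 if there was nothing to do. */
int mock_mce_notify(long now)
{
	/* Not more than two messages every 60 ticks. */
	static struct mock_ratelimit rl = { 60, 2, 0, 0 };

	if (!mock_notify_pending)
		return 0;
	mock_notify_pending = false;

	/* Only the legacy wakeup depends on the config option. */
	if (MOCK_LEGACY_ENABLED)
		mock_legacy_wakeups++;

	/* The message is kept in both configurations; only the wording differs. */
	if (mock_ratelimit_ok(&rl, now)) {
		printf("[Hardware Error]: Machine check events %s\n",
		       MOCK_LEGACY_ENABLED ? "logged" : "generated");
		mock_messages++;
	}
	return 1;
}
```

With the option off, the mock still leaves a trace in the log for each
(rate-limited) event, which is what I relied on when debugging the EDAC
reports.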

Thanks!
-Qiuxu
