Message-ID: <3eaf01db-6f51-4125-a4bd-bc54c6576e28@suse.com>
Date: Tue, 13 Jan 2026 21:31:19 +0200
From: Nikolay Borisov <nik.borisov@...e.com>
To: "Luck, Tony" <tony.luck@...el.com>, Borislav Petkov <bp@...en8.de>,
"Li, Rongqing" <lirongqing@...du.com>
Cc: Thomas Gleixner <tglx@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, "x86@...nel.org"
<x86@...nel.org>, "H . Peter Anvin" <hpa@...or.com>,
Yazen Ghannam <yazen.ghannam@....com>, "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>,
Avadhut Naik <avadhut.naik@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>
Subject: Re: Re: Re: Re: [External Mail] Re: [PATCH] x86/mce: Fix timer interval adjustment after logging a MCE event
On 13.01.26 20:53, Luck, Tony wrote:
>>> The comment in mce_timer_fn says to adjust the polling interval, but
>>> I notice the kernel log always shows an MCE log every 5 minutes. Is this
>>> normal?
>>
>> Use git annotate to figure out which patch added this comment and in context
>> of what and that'll tell you why.
>>
>> As to the 5 minutes, look at how the check interval gets established.
>
> Once upon a time the polling interval started out at 5 minutes, but the
> interval was halved each time an error was found (so interval went
> 150s, 75s, 37s, ... down to 1s). If no error was found, then the interval
> was doubled (going back up to 300s).
>
> This is described in the comment:
>
> /*
> * Alert userspace if needed. If we logged an MCE, reduce the polling
> * interval, otherwise increase the polling interval.
> */
>
> It seems that the kernel isn't doing that today: it polls at a fixed 300 seconds
> even though errors are being found and logged. Interesting that the timestamps
> are 327.68 seconds apart, rather than 300 and change. So there is some strange
> stuff going on.
I think Li Rongqing's patch does exactly that, since it predicates the
halving/doubling of the interval on whether an error was found and
not on whether it was reported to user space (which is what
mce_notify_irq() does). The two concepts seem to be independent, and the
former is the one we care about when deciding how to adjust the
interval, no?
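
To make the halving/doubling concrete, here is a minimal userspace sketch
of the policy Tony describes, keyed on "was an error found" rather than on
"was user space notified". It is not the kernel code: adjust_interval(),
POLL_MIN_SEC and POLL_MAX_SEC are hypothetical names, and the 1s/300s
bounds are simply the figures quoted above.

```c
#include <stdbool.h>

/* Hypothetical bounds, taken from the 1s and 300s figures quoted above. */
#define POLL_MIN_SEC 1UL
#define POLL_MAX_SEC 300UL

/*
 * Illustrative only: return the next polling interval in seconds.
 * Halve it when the last poll found an error, double it otherwise,
 * and clamp the result to [POLL_MIN_SEC, POLL_MAX_SEC].
 */
static unsigned long adjust_interval(unsigned long iv, bool error_found)
{
	if (error_found) {
		iv /= 2;
		if (iv < POLL_MIN_SEC)
			iv = POLL_MIN_SEC;
	} else {
		iv *= 2;
		if (iv > POLL_MAX_SEC)
			iv = POLL_MAX_SEC;
	}
	return iv;
}
```

Starting from 300 and feeding it three polls that each found an error
gives 150, 75, 37, which matches the sequence above. The point of the
sketch is only that error_found is the input to the decision, regardless
of whether anything was reported to user space.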
<snip>