Message-ID: <20260113095233.GBaWYV4eSjNx9YaGbC@fat_crate.local>
Date: Tue, 13 Jan 2026 10:52:33 +0100
From: Borislav Petkov <bp@...en8.de>
To: lirongqing <lirongqing@...du.com>
Cc: Thomas Gleixner <tglx@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>, Tony Luck <tony.luck@...el.com>,
Yazen Ghannam <yazen.ghannam@....com>,
Nikolay Borisov <nik.borisov@...e.com>,
Qiuxu Zhuo <qiuxu.zhuo@...el.com>,
Avadhut Naik <avadhut.naik@....com>, linux-kernel@...r.kernel.org,
linux-edac@...r.kernel.org
Subject: Re: [v2 PATCH] x86/mce: Fix timer interval adjustment after logging
a MCE event
On Tue, Jan 13, 2026 at 02:05:06AM -0500, lirongqing wrote:
> From: Li RongQing <lirongqing@...du.com>
>
> Since commit 011d82611172 ("RAS: Add a Corrected Errors Collector"),
> mce_timer_fn() has incorrectly determined whether to adjust the
> timer interval. The issue arises because mce_notify_irq() now always
> returns false when called from the timer path, since the polling code
> never sets bit 0 of mce_need_notify. This prevents proper adjustment of
> the timer interval based on whether MCE events were logged.
>
> The mce_notify_irq() is called from two contexts:
> 1. Early notifier block - correctly sets mce_need_notify
> 2. Timer function - never sets mce_need_notify, making it a noop
> (though logged errors are still processed through mce_log()->
> x86_mce_decoder_chain -> early notifier).
>
> Fix this by modifying machine_check_poll() to return a boolean indicating
> whether any MCE was logged, and updating mc_poll_banks() and related
> functions to propagate this return value. Then, mce_timer_fn() can use
> this direct return value instead of relying on mce_notify_irq() for
> timer interval decisions.
>
> This ensures the timer interval is correctly reduced when MCE events are
> logged and increased when no events occur.
>
> Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
> Signed-off-by: Li RongQing <lirongqing@...du.com>
> ---
> Diff with v1: rewrite commit message
>
> arch/x86/include/asm/mce.h | 2 +-
> arch/x86/kernel/cpu/mce/core.c | 17 +++++++++++------
> arch/x86/kernel/cpu/mce/intel.c | 8 ++++++--
> arch/x86/kernel/cpu/mce/internal.h | 2 +-
> 4 files changed, 19 insertions(+), 10 deletions(-)
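For readers following the thread, the behaviour the quoted commit message proposes can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel code or the patch itself: poll_banks_model(), next_poll_interval() and the two bounds are hypothetical names and values, and the halve-or-double step is an illustrative choice for "reduce when events are logged, increase when none occur".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bounds in arbitrary tick units; not the kernel's values. */
#define POLL_INTERVAL_MIN 1UL
#define POLL_INTERVAL_MAX 3000UL

/*
 * Hypothetical stand-in for a poll routine that reports whether it
 * logged anything. Each set bit in pending_mask models one bank with
 * an error waiting to be logged.
 */
static bool poll_banks_model(unsigned int pending_mask)
{
	bool logged = false;

	for (unsigned int bank = 0; bank < 32; bank++) {
		if (pending_mask & (1U << bank))
			logged = true;	/* a real poller would log the event here */
	}

	return logged;
}

/*
 * Pick the next polling interval directly from the poll result:
 * shorten it when something was logged, lengthen it otherwise,
 * and clamp to the assumed bounds.
 */
static unsigned long next_poll_interval(unsigned long iv, bool logged)
{
	assert(iv >= POLL_INTERVAL_MIN && iv <= POLL_INTERVAL_MAX);

	if (logged) {
		iv /= 2;
		if (iv < POLL_INTERVAL_MIN)
			iv = POLL_INTERVAL_MIN;
	} else {
		iv *= 2;
		if (iv > POLL_INTERVAL_MAX)
			iv = POLL_INTERVAL_MAX;
	}

	return iv;
}
```

In this model a timer callback would call next_poll_interval(iv, poll_banks_model(mask)) and re-arm itself with the result, so the interval decision depends only on what the poll itself reported.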
We're discussing the issue here:
https://lore.kernel.org/r/268e2f0512db435685af987a2ba6893c@baidu.com
Why are you sending another patch before we have agreed on whether there's
an issue in the first place?!
NAK!
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette