Message-ID: <20250911155314.GA108087@yaz-khff2.amd.com>
Date: Thu, 11 Sep 2025 11:53:14 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: Nikolay Borisov <nik.borisov@...e.com>
Cc: x86@...nel.org, Tony Luck <tony.luck@...el.com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
linux-kernel@...r.kernel.org, linux-edac@...r.kernel.org,
Smita.KoralahalliChannabasappa@....com,
Qiuxu Zhuo <qiuxu.zhuo@...el.com>, linux-acpi@...r.kernel.org
Subject: Re: [PATCH v6 10/15] x86/mce/amd: Enable interrupt vectors once per-CPU on SMCA systems

On Thu, Sep 11, 2025 at 01:22:10PM +0300, Nikolay Borisov wrote:
>
>
> On 9/8/25 18:40, Yazen Ghannam wrote:
> > Scalable MCA systems have a per-CPU register that gives the APIC LVT
> > offset for the thresholding and deferred error interrupts.
> >
> > Currently, this register is read once to set up the deferred error
> > interrupt and then read again for each thresholding block. Furthermore,
> > the APIC LVT registers are configured each time, but they only need to
> > be configured once per-CPU.
> >
> > Move the APIC LVT setup to the early part of CPU init, so that the
> > registers are set up once. Also, this ensures that the kernel is ready
> > to service the interrupts before the individual error sources (each MCA
> > bank) are enabled.
> >
> > Apply this change only to SMCA systems to avoid breaking any legacy
> > behavior. The deferred error interrupt is technically advertised by the
> > SUCCOR feature. However, this was first made available on SMCA systems.
> > Therefore, only set up the deferred error interrupt on SMCA systems and
> > simplify the code.
> >
> > Guidance from hardware designers is that the LVT offsets provided from
> > the platform should be used. The kernel should not try to enforce
> > specific values. However, the kernel should check that an LVT offset is
> > not reused for multiple sources.
> >
> > Therefore, remove the extra checking and value enforcement from the MCE
> > code. The "reuse/conflict" case is already handled in
> > setup_APIC_eilvt().
> >
> > Tested-by: Tony Luck <tony.luck@...el.com>
> > Reviewed-by: Tony Luck <tony.luck@...el.com>
> > Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
> > ---
> >
> > Notes:
> > Link:
> > https://lore.kernel.org/r/20250825-wip-mca-updates-v5-15-865768a2eef8@amd.com
> > v5->v6:
> > * Applied "bools to flags" and other fixups from Boris.
> > v4->v5:
> > * Added back to set.
> > * Updated commit message with more details.
> > v3->v4:
> > * Dropped from set.
> > v2->v3:
> > * Add tags from Tony.
> > v1->v2:
> > * Use new per-CPU struct.
> > * Don't set up interrupt vectors.
> >
> > arch/x86/kernel/cpu/mce/amd.c | 121 ++++++++++++++++++------------------------
> > 1 file changed, 53 insertions(+), 68 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
> > index 1b1b83b3aef9..a6f5c9339d7c 100644
> > --- a/arch/x86/kernel/cpu/mce/amd.c
> > +++ b/arch/x86/kernel/cpu/mce/amd.c
> > @@ -43,9 +43,6 @@
> > /* Deferred error settings */
> > #define MSR_CU_DEF_ERR 0xC0000410
>
> nit: While touching this code why not finally rename this in line with the
> APM, section 9.3.1.4: MCA_INTR_CFG
>
> Perhaps as a separate patch. I see that you did send a patch containing this
> rename:
> https://lore.kernel.org/all/20231118193248.1296798-13-yazen.ghannam@amd.com/
> But I guess it didn't land.

Yep, thanks for noticing. :)

IIRC, I tried to reduce this set down to (mostly) functional changes.
I think that there is still more worthwhile refactoring to do.
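
For anyone following along, here is a rough, standalone sketch of the idea
in the commit message: read the per-CPU register once, pull out both LVT
offsets, and check that the two sources do not share an offset. This is
not the patch code. The struct, the helper names, and the bit positions
(deferred error offset in bits 7:4, thresholding offset in bits 47:44) are
assumptions made for illustration and should be checked against the APM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout of the register (MSR 0xC0000410); verify against the APM. */
#define DFR_LVT_OFF_SHIFT	4	/* deferred error LVT offset: bits 7:4 (assumption) */
#define THR_LVT_OFF_SHIFT	44	/* thresholding LVT offset: bits 47:44 (assumption) */
#define LVT_OFF_MASK		0xFULL

/* Hypothetical container for the two offsets decoded from one register read. */
struct lvt_offsets {
	uint8_t dfr;	/* deferred error interrupt */
	uint8_t thr;	/* thresholding interrupt */
};

/* Decode both offsets from a single 64-bit register value. */
static struct lvt_offsets decode_intr_cfg(uint64_t val)
{
	struct lvt_offsets o;

	o.dfr = (uint8_t)((val >> DFR_LVT_OFF_SHIFT) & LVT_OFF_MASK);
	o.thr = (uint8_t)((val >> THR_LVT_OFF_SHIFT) & LVT_OFF_MASK);

	return o;
}

/* True if the platform handed the same LVT offset to both sources. */
static bool lvt_offsets_conflict(struct lvt_offsets o)
{
	return o.dfr == o.thr;
}
```

In the kernel, the conflict case is left to setup_APIC_eilvt(), as the
commit message says. The check above only models the idea in userspace.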
Thanks,
Yazen