[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250902133015.GA18483@yaz-khff2.amd.com>
Date: Tue, 2 Sep 2025 09:30:15 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: Borislav Petkov <bp@...en8.de>
Cc: x86@...nel.org, Tony Luck <tony.luck@...el.com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
linux-kernel@...r.kernel.org, linux-edac@...r.kernel.org,
Smita.KoralahalliChannabasappa@....com,
Qiuxu Zhuo <qiuxu.zhuo@...el.com>,
Nikolay Borisov <nik.borisov@...e.com>, linux-acpi@...r.kernel.org
Subject: Re: [PATCH v5 07/20] x86/mce: Reorder __mcheck_cpu_init_generic()
call
On Mon, Sep 01, 2025 at 07:07:41PM +0200, Borislav Petkov wrote:
> On Mon, Aug 25, 2025 at 05:33:04PM +0000, Yazen Ghannam wrote:
> > Move __mcheck_cpu_init_generic() after __mcheck_cpu_init_prepare_banks()
> > so that MCA is enabled after the first MCA polling event.
> >
> > This brings the MCA init flow closer to what is described in the x86 docs.
> >
> > The AMD PPRs say
> > "The operating system must initialize the MCA_CONFIG registers prior
> > to initialization of the MCA_CTL registers.
> >
> > The MCA_CTL registers must be initialized prior to enabling the error
> > reporting banks in MCG_CTL".
> >
> > However, the Intel SDM "Machine-Check Initialization Pseudocode" says
> > MCG_CTL first then MCi_CTL.
> >
> > But both agree that CR4.MCE should be set last.
> >
> > Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
> > ---
> >
> > Notes:
> > Link:
> > https://lore.kernel.org/r/52a37afe-c41b-4f20-bbdc-bddc3ae26260@suse.com
> >
> > v4->v5:
> > * New in v5.
> >
> > arch/x86/kernel/cpu/mce/core.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 0326fbb83adc..9cbf9e8c8060 100644
> > --- a/arch/x86/kernel/cpu/mce/core.c
> > +++ b/arch/x86/kernel/cpu/mce/core.c
> > @@ -2272,9 +2272,9 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
> >
> > mca_cfg.initialized = 1;
> >
> > - __mcheck_cpu_init_generic();
> > __mcheck_cpu_init_vendor(c);
> > __mcheck_cpu_init_prepare_banks();
> > + __mcheck_cpu_init_generic();
>
> With that flow we have now:
>
> 1. MCA_CTL
> 2. CR4.MCE
> 3. MCG_CTL
>
> So this is nothing like in the commit message above and the MC*_CTL flow is
> not what the SDM suggests and CR4.MCE is not last.
>
> So what's the point even here?
>
The main point is to initialize MCA after polling for leftover errors.
You're right that this change doesn't bring the code in sync with the
docs. I'll work on it more.
Moving CR4.MCE last should be okay, but deciding when to do MCG_CTL
needs some more thought. Maybe we can have an early call for Intel and
a late call for AMD?
Thanks,
Yazen
Powered by blists - more mailing lists