Message-ID: <Y9q+WL+hnS5ZymDj@a4bf019067fa.jf.intel.com>
Date: Wed, 1 Feb 2023 11:32:40 -0800
From: Ashok Raj <ashok.raj@...el.com>
To: Dave Hansen <dave.hansen@...el.com>
CC: Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>,
Ingo Molnar <mingo@...nel.org>,
Tony Luck <tony.luck@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Tom Lendacky <thomas.lendacky@....com>,
Stefan Talpalaru <stefantalpalaru@...oo.com>,
David Woodhouse <dwmw2@...radead.org>,
"Benjamin Herrenschmidt" <benh@...nel.crashing.org>,
Jonathan Corbet <corbet@....net>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...nel.org>,
Andrew Cooper <Andrew.Cooper3@...rix.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Martin Pohlack <mpohlack@...zon.de>,
Ashok Raj <ashok.raj@...el.com>
Subject: Re: [Patch v3 Part2 3/9] x86/microcode/intel: Fix collect_cpu_info()
to reflect current microcode
On Wed, Feb 01, 2023 at 11:13:58AM -0800, Dave Hansen wrote:
> On 1/30/23 13:39, Ashok Raj wrote:
> > Currently collect_cpu_info() is only returning what was cached earlier
> > instead of reading the current revision from the proper MSR.
> >
> > Collect the current revision and report that value instead of reflecting
> > what was cached in the past.
> >
> > [TBD:
> > Need to change microcode/amd.c. I didn't quite follow the logic since
> > it reports the revision from the patch file, instead of reporting the
> > real PATCH_LEVEL MSR.
> >
> > Untested on AMD.
> > ]
>
> This thread is meandering a bit. I think it's because this changelog
> doesn't have a problem statement. It's hard to agree on a patch being a
> solution to anything if we haven't first agreed on the problem.
>
> What is the problem?
I alluded to it here, but yes, it is clearly missing from the commit log:
https://lore.kernel.org/lkml/Y9mW7EiL%2FBpYFLWn@a4bf019067fa.jf.intel.com/

Thomas pointed out here https://lore.kernel.org/lkml/87y1pygiyf.ffs@tglx/
that the error handling in the __reload_late()::wait_for_siblings() code
path is completely broken.
This is the one that I "assumed" he was referring to. All we need there is
to update the current revision, but we end up depending on the behavior of
apply_microcode(), which might accidentally have side effects.
Instead, call only collect_cpu_info() and let it update the per-cpu
revision. There is no risk in doing that, whereas accidentally falling
through to apply_microcode() might carry some.
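
To make the cached-versus-current distinction concrete, here is a minimal
userspace sketch. It is not the kernel code: hw_rev[], cpu_sig_model,
model_hw_update(), collect_cached() and collect_current() are all
hypothetical names, and the array merely stands in for the revision the
hardware would report.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the revision each CPU's hardware reports. */
static uint32_t hw_rev[NR_CPUS];

/* Hypothetical stand-in for the per-CPU signature the loader caches. */
struct cpu_sig_model {
	uint32_t rev;
};
static struct cpu_sig_model cached[NR_CPUS];

/* Models an update changing the revision the hardware reports. */
static void model_hw_update(int cpu, uint32_t new_rev)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	hw_rev[cpu] = new_rev;
}

/* Behaviour described as the problem: return what was cached earlier. */
static uint32_t collect_cached(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cached[cpu].rev;
}

/* Behaviour being proposed: read the current value, refresh the cache. */
static uint32_t collect_current(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cached[cpu].rev = hw_rev[cpu];
	return cached[cpu].rev;
}
```

In this model collect_cached() keeps reporting the stale value after
model_hw_update(), until collect_current() has run once for that CPU.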
>
> What does this "fix"?
The code performs this delicate late-load dance to keep sibling threads
quiet while the update is performed.
At wait_for_siblings(), once all threads have arrived, the sibling then
calls apply_microcode(), which seems wrong.
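
A sequential toy model of the alternative, where the primary thread applies
the update and siblings only refresh their revision, might look like the
sketch below. Everything here is an assumption made for illustration:
core_model, model_apply(), model_collect() and model_late_load() are
hypothetical, there is no real rendezvous, and the threads of a core are
simply assumed to share one hardware revision.

```c
#include <assert.h>
#include <stdint.h>

#define THREADS_PER_CORE 2

/* Hypothetical model: one core whose threads share a microcode revision. */
struct core_model {
	uint32_t hw_rev;                       /* what the hardware reports */
	uint32_t cached_rev[THREADS_PER_CORE]; /* per-thread cached revision */
	int apply_calls;                       /* number of update attempts */
};

/* Models apply_microcode(): changes hardware state, so it has side effects. */
static void model_apply(struct core_model *c, int thread, uint32_t new_rev)
{
	assert(thread >= 0 && thread < THREADS_PER_CORE);
	c->apply_calls++;
	c->hw_rev = new_rev;
	c->cached_rev[thread] = new_rev;
}

/* Models collect_cpu_info(): only reads the hardware revision. */
static void model_collect(struct core_model *c, int thread)
{
	assert(thread >= 0 && thread < THREADS_PER_CORE);
	c->cached_rev[thread] = c->hw_rev;
}

/* Sequential stand-in for the late-load step described above. */
static void model_late_load(struct core_model *c, uint32_t new_rev)
{
	/* The primary thread performs the update. */
	model_apply(c, 0, new_rev);

	/* Siblings would wait here; afterwards they only refresh. */
	for (int t = 1; t < THREADS_PER_CORE; t++)
		model_collect(c, t);
}
```

In this model the update is attempted once per core, yet every thread ends
up with the new revision cached.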