linux-kernel - Re: [Patch v3 Part2 3/9] x86/microcode/intel: Fix collect_cpu

Message-ID: <Y9qBmugSm+o5u4pq@a4bf019067fa.jf.intel.com>
Date:   Wed, 1 Feb 2023 07:13:30 -0800
From:   Ashok Raj <ashok.raj@...el.com>
To:     Borislav Petkov <bp@...en8.de>
CC:     "Luck, Tony" <tony.luck@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "Schofield, Alison" <alison.schofield@...el.com>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        Tom Lendacky <thomas.lendacky@....com>,
        "Stefan Talpalaru" <stefantalpalaru@...oo.com>,
        David Woodhouse <dwmw2@...radead.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Jonathan Corbet <corbet@....net>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Peter Zilstra <peterz@...radead.org>,
        "Lutomirski, Andy" <luto@...nel.org>,
        "andrew.cooper3@...rix.com" <andrew.cooper3@...rix.com>,
        "Ostrovsky, Boris" <boris.ostrovsky@...cle.com>,
        Martin Pohlack <mpohlack@...zon.de>,
        Ashok Raj <ashok.raj@...el.com>
Subject: Re: [Patch v3 Part2 3/9] x86/microcode/intel: Fix collect_cpu_info()
 to reflect current microcode

On Wed, Feb 01, 2023 at 01:53:32PM +0100, Borislav Petkov wrote:
> On Tue, Jan 31, 2023 at 10:43:23PM +0000, Luck, Tony wrote:
> > In an ideal world yes. But what if T1 arrives here and tries to do the
> > update while T0, which has returned out of the microcode update
> > code and could be doing anything, happen to be doing WRMSR(some MSR
> > that the ucode update is tinkering with).
> > 
> > Now T0 explodes (not literally, I hope!) but does something crazy because
> > it was in the middle of some microcode flow that got updated between two
> > operations.
> 
> So first of all, I'm wondering whether the scenario you're chasing is
> something completely hypothetical or you're actually thinking of
> something concrete which has actually happened or there's high potential
> for it.
> 
> In that case, that late patching sync algorithm would need to be made
> more robust to handle cases like that.

That's correct. But fundamentally we sent the sibling down the
apply_microcode() path just to make sure the per-thread info is updated.

It appears the code relies on a side effect, namely that the revision gets
updated, even though we don't actually intend to perform a wrmsr on the
sibling in the normal case where the primary completes the update.

If the purpose is only to update the revision, using collect_cpu_info()
seems more appropriate, and it doesn't have any of the implied issues of
going through a wrmsr flow. It's not broken today, but the code isn't
future proof. Calling only the revision update keeps those questions at
bay.
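
To make the distinction concrete, here is a rough userspace model, not the
kernel code. Every name in it (sim_core, sim_thread, sim_apply_microcode(),
sim_refresh_rev(), sim_late_load()) is made up for illustration; it only
assumes that both threads of a core share one loaded revision and that each
thread keeps its own cached copy of it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one core, shared by two sibling threads. */
struct sim_core {
	uint32_t hw_rev;        /* revision currently loaded in the core */
	unsigned int wrmsr_cnt; /* number of simulated update-trigger writes */
	int fail_writes;        /* when set, simulated updates do not take */
};

/* Hypothetical per-thread bookkeeping (the cached revision). */
struct sim_thread {
	struct sim_core *core;
	uint32_t cached_rev;
};

/* Models the apply path: may issue a write, then caches the revision. */
static int sim_apply_microcode(struct sim_thread *t, uint32_t patch_rev)
{
	struct sim_core *c = t->core;

	if (c->hw_rev < patch_rev) {
		c->wrmsr_cnt++;
		if (!c->fail_writes)
			c->hw_rev = patch_rev;
	}
	t->cached_rev = c->hw_rev;
	return c->hw_rev >= patch_rev ? 0 : -1;
}

/* Models a revision-only refresh: reads the core state, never writes it. */
static void sim_refresh_rev(struct sim_thread *t)
{
	t->cached_rev = t->core->hw_rev;
}

/* Primary applies the update; the sibling only refreshes its cached copy. */
static int sim_late_load(struct sim_thread *primary,
			 struct sim_thread *sibling, uint32_t patch_rev)
{
	int ret;

	assert(primary && sibling && primary->core == sibling->core);

	ret = sim_apply_microcode(primary, patch_rev);
	sim_refresh_rev(sibling);
	return ret;
}
```

In this model the sibling ends up with the same cached revision as the
primary, and the write counter never goes above one per load attempt, even
when the primary's update does not take.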

I think this is the cleanup Thomas was implying in his comments.

> 
> Because from what I'm reading above, this doesn't sound like the
> reporting is wrong only but more like, if T0 fails the update and T1
> gets to do that update for a change, then crap can happen.
> 
> Which means, our update dance cannot handle that case properly.
> 

It doesn't need to if we don't do an apply_microcode() for the sibling.

Cheers,
Ashok