linux-kernel - Re: [PATCH 5/5] x86/microcode: Handle NMI's during microcode update.

Message-ID: <Yvhj5YaLdJnF60uR@araj-dh-work>
Date:   Sun, 14 Aug 2022 02:54:29 +0000
From:   Ashok Raj <ashok.raj@...el.com>
To:     Andy Lutomirski <luto@...nel.org>
CC:     Borislav Petkov <bp@...en8.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Tony Luck" <tony.luck@...el.com>,
        Dave Hansen <dave.hansen@...el.com>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        "luto@...capital.net" <luto@...capital.net>,
        Tom Lendacky <thomas.lendacky@....com>,
        Andrew Cooper <andrew.cooper3@...rix.com>,
        "Ashok Raj" <ashok.raj@...el.com>
Subject: Re: [PATCH 5/5] x86/microcode: Handle NMI's during microcode update.

On Sat, Aug 13, 2022 at 05:13:13PM -0700, Andy Lutomirski wrote:
> 
> 
> On Sat, Aug 13, 2022, at 3:38 PM, Ashok Raj wrote:
> > Microcode updates need a guarantee that the thread sibling that is waiting
> > for the update to finish on the primary core will not execute any
> > instructions until the update is complete. This is required to guarantee
> > any MSR or instruction that's being patched will be executed before the
> > update is complete.
> >
> > After the stop_machine() rendezvous, an NMI handler is registered. If an
> > NMI were to happen while the microcode update is not complete, the
> > secondary thread will spin until the ucode update state is cleared.
> >
> > Couple of choices discussed are:
> >
> > 1. Rendezvous inside the NMI handler, and also perform the update from
> >    within the handler. This seemed too risky and might cause instability
> >    with the races that we would need to solve. This would be a difficult
> >    choice.
> 
> I prefer choice 1.  As I understand it, Xen has done this for a while to good effect.
> 
> If I were implementing this, I would rendezvous via stop_machine as usual.  Then I would set a flag or install a handler indicating that we are doing a microcode update, send NMI-to-self, and rendezvous in the NMI handler and do the update.

Well, that is exactly what I did in my first attempt. The code looked so
beautiful in the eyes of its creator :-) but somehow I couldn't get it to
stop locking up. But the new implementation also seems more efficient: we
do nothing until a secondary drops into the NMI handler, and then we hold
it hostage right there.

I thought this was a slight improvement, in the sense that we don't take
the extra hit of sending and receiving interrupts.
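
To make the "hold them hostage" idea concrete, here is a rough userspace
analogy using pthreads and C11 atomics. It is only a sketch and not the
patch: every name in it (ucode_update_in_progress, fake_nmi_handler,
simulate_update, and so on) is made up for illustration, a plain thread
stands in for the sibling, and an ordinary function call stands in for the
NMI. The primary never sends anything to the sibling; the sibling only
spins if it happens to land in the handler while the update flag is set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state, loosely modelled on "ucode update state". */
static atomic_bool ucode_update_in_progress;
static atomic_int  ucode_revision;           /* stand-in for patched state */
static atomic_int  revision_seen_by_sibling;

/* Stand-in for the NMI handler running on the secondary thread. */
static void fake_nmi_handler(void)
{
	/* Hold the sibling here until the primary clears the flag. */
	while (atomic_load(&ucode_update_in_progress))
		;	/* kernel code would call cpu_relax() here */
}

static void *sibling_thread(void *arg)
{
	(void)arg;

	/* Pretend an NMI arrives at some arbitrary point. */
	fake_nmi_handler();

	/* Only after the handler returns does the sibling touch state. */
	atomic_store(&revision_seen_by_sibling, atomic_load(&ucode_revision));
	return NULL;
}

/* Returns the revision the sibling observed, or -1 on thread failure. */
int simulate_update(int new_revision)
{
	pthread_t sibling;

	/* Set the flag before the sibling can possibly run. */
	atomic_store(&ucode_update_in_progress, true);

	if (pthread_create(&sibling, NULL, sibling_thread, NULL) != 0) {
		atomic_store(&ucode_update_in_progress, false);
		return -1;
	}

	/* The primary does the "update" without signalling the sibling. */
	atomic_store(&ucode_revision, new_revision);

	/* Clearing the flag releases a sibling spinning in the handler. */
	atomic_store(&ucode_update_in_progress, false);

	pthread_join(sibling, NULL);
	assert(!atomic_load(&ucode_update_in_progress));
	return atomic_load(&revision_seen_by_sibling);
}
```

Whether the sibling enters the handler before or after the flag is cleared,
it can only read the revision once the update is done, so
simulate_update(n) should return n. The real thing of course has to deal
with actual NMI semantics, which this analogy ignores entirely.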

My first attempt did not limit the rendezvous to the threads of a core.
Everything looked great, but I just couldn't get it working in time.

Cheers,
Ashok