Message-ID: <YvkJjsdlDIcerqLg@araj-dh-work>
Date:   Sun, 14 Aug 2022 14:41:18 +0000
From:   Ashok Raj <ashok.raj@...el.com>
To:     Andrew Cooper <Andrew.Cooper3@...rix.com>
CC:     Andy Lutomirski <luto@...nel.org>, Borislav Petkov <bp@...en8.de>,
        "Thomas Gleixner" <tglx@...utronix.de>,
        Tony Luck <tony.luck@...el.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        "luto@...capital.net" <luto@...capital.net>,
        Tom Lendacky <thomas.lendacky@....com>,
        Ashok Raj <ashok.raj@...el.com>
Subject: Re: [PATCH 5/5] x86/microcode: Handle NMI's during microcode update.

Hi Andrew,

On Sun, Aug 14, 2022 at 11:58:17AM +0000, Andrew Cooper wrote:
> >> If I were implementing this, I would rendezvous via stop_machine as usual.  Then I would set a flag or install a handler indicating that we are doing a microcode update, send NMI-to-self, and rendezvous in the NMI handler and do the update.
> > Well, that is exactly what I did for the first attempt. The code looked so
> > beautiful in the eyes of the creator :-) but somehow I couldn't get it to
> > not lock up.
> 
> So the way we do this in Xen is to rendezvous in stop machine, then have
> only the siblings self-NMI.  The primary threads don't need to be in NMI
> context, because the WRMSR to trigger the update *is* atomic with NMIs.
> 
> However, you do need to make sure that the NMI wait loop knows not to
> wait for primary threads, otherwise you can deadlock when taking an NMI
> on a primary thread between setting up the NMI handler and actually
> issuing the update.
> 

I'm almost sure that was the deadlock I ran into. You are correct, the
primary thread doesn't need to be in NMI, since once the wrmsr starts, it
can't be interrupted.

But the primary needs to wait until its own siblings have dropped into NMI
before proceeding to perform the wrmsr.

In the stop_machine() handler, the primary thread waits for its thread
siblings to enter NMI and report in. The siblings simply send themselves
an NMI (self-IPI) and then proceed to wait for exit_sync.

The primary then does the wrmsr flow and clears the wait_cpus mask so that
the secondaries inside the NMI handler can release themselves.

Everyone resyncs at the exit rendezvous.
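
For illustration only, the ordering described above can be modeled as a
small single-threaded state machine in user-space C. This is a rough sketch
and not the patch itself: apart from wait_cpus, every name here (core_model,
cpu_step, simulate_core_update, the state names) is hypothetical, thread 0
stands in for the primary, the wrmsr is replaced by setting a flag, and it
is assumed that a sibling reports in by setting its own bit in wait_cpus.

```c
#include <stdbool.h>

#define MAX_THREADS 8

/* Hypothetical per-thread states for this model; not kernel identifiers. */
enum cpu_state {
	CPU_IN_STOP_MACHINE,	/* arrived in the stop_machine() handler */
	CPU_WAIT_SIBLINGS,	/* primary: waiting for siblings to report */
	CPU_IN_NMI_WAIT,	/* sibling: in NMI handler, held by wait_cpus */
	CPU_EXIT_SYNC,		/* waiting at the exit rendezvous */
	CPU_DONE
};

struct core_model {
	int nthreads;			/* thread 0 is the primary */
	enum cpu_state state[MAX_THREADS];
	unsigned int wait_cpus;		/* siblings currently held in NMI */
	int exit_count;			/* threads that reached exit sync */
	bool updated;			/* stands in for the wrmsr */
	bool violation;			/* update ran with a sibling outside NMI */
};

/* Bits 1..nthreads-1: every thread except the primary. */
unsigned int sibling_mask(const struct core_model *c)
{
	return ((1u << c->nthreads) - 1u) & ~1u;
}

/* Advance one thread by at most one step; a "spin" is a step that does nothing. */
void cpu_step(struct core_model *c, int cpu)
{
	switch (c->state[cpu]) {
	case CPU_IN_STOP_MACHINE:
		if (cpu == 0) {
			c->state[cpu] = CPU_WAIT_SIBLINGS;
		} else {
			/* Self-NMI: the handler reports this sibling (assumed). */
			c->wait_cpus |= 1u << cpu;
			c->state[cpu] = CPU_IN_NMI_WAIT;
		}
		break;
	case CPU_WAIT_SIBLINGS:
		if (c->wait_cpus != sibling_mask(c))
			break;		/* keep waiting for siblings */
		for (int i = 1; i < c->nthreads; i++)
			if (c->state[i] != CPU_IN_NMI_WAIT)
				c->violation = true;
		c->updated = true;	/* the wrmsr flow would go here */
		c->wait_cpus = 0;	/* release siblings from NMI */
		c->exit_count++;
		c->state[cpu] = CPU_EXIT_SYNC;
		break;
	case CPU_IN_NMI_WAIT:
		if (c->wait_cpus & (1u << cpu))
			break;		/* still held by the primary */
		c->exit_count++;
		c->state[cpu] = CPU_EXIT_SYNC;
		break;
	case CPU_EXIT_SYNC:
		if (c->exit_count == c->nthreads)
			c->state[cpu] = CPU_DONE;
		break;
	case CPU_DONE:
		break;
	}
}

/*
 * Step all threads round-robin, in forward or reverse order.
 * Returns true if every thread finished, the update ran, and no sibling
 * was outside NMI when it ran.
 */
bool simulate_core_update(int nthreads, bool reverse_order)
{
	struct core_model c = { .nthreads = nthreads };

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return false;

	for (int round = 0; round < 16; round++)
		for (int k = 0; k < nthreads; k++)
			cpu_step(&c, reverse_order ? nthreads - 1 - k : k);

	for (int i = 0; i < nthreads; i++)
		if (c.state[i] != CPU_DONE)
			return false;

	return c.updated && !c.violation;
}
```

Calling simulate_core_update(2, false) and simulate_core_update(2, true)
exercises the two arrival orders for a two-thread core. The model only
checks the ordering (siblings parked in NMI before the update, released
after it); it says nothing about real NMI delivery or timing.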

I have this coded, will test and repost.

Cheers,
Ashok
