Date:   Mon, 29 Aug 2022 07:23:19 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Ashok Raj <ashok.raj@...el.com>, Borislav Petkov <bp@...en8.de>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tony Luck <tony.luck@...el.com>,
        Dave Hansen <dave.hansen@...el.com>,
        LKML Mailing List <linux-kernel@...r.kernel.org>,
        X86-kernel <x86@...nel.org>,
        Andy Lutomirski <luto@...capital.net>,
        Tom Lendacky <thomas.lendacky@....com>,
        Jacob Jun Pan <jacob.jun.pan@...el.com>
Subject: Re: [PATCH v3 3/5] x86/microcode: Avoid any chance of MCE's during
 microcode update

On 8/17/22 08:06, Ashok Raj wrote:
> On Wed, Aug 17, 2022 at 04:19:40PM +0200, Borislav Petkov wrote:
>> On Wed, Aug 17, 2022 at 12:30:49PM +0000, Ashok Raj wrote:
>>> You will find out when the system returns after reboot, and hopefully
>>> it wasn't promoted to a cold boot, which will lose the MCE banks.
>> Not good enough!
> I probably misread your question.. are you suggesting we add some WARN when
> we initiate late_load? I thought you were asking if the HW must signal
> something and OS should log when an MCE happens if MCIP=1
>
>
>> This should issue a warning in dmesg that a potential MCE while update
>> is running would cause a lockup. That is if we don't disable MCE around
>> it.
>>
>> If we decide to disable MCE, it should say shutdown.
> Ok, that clarifies it.. "IF we choose to set MCIP=1, we should tell users
> that hell can break loose, get under the table" :-)
>
>>> Meaning deal with the effect of a really rare MCE, rather than trying
>>> to avoid it. Taking the MCE is more important than finishing the update
>>> and losing what the error signal was trying to convey.
>> Right now I'm inclined to not do anything and warn of a potential rare
>> situation.
> Encouraging.. So I'll drop that patch from the list next time around.


If I followed all this correctly, I agree. If we set MCIP to force a
crash on MCE, then any MCE during the update is guaranteed to crash the
machine. If we don't, then an MCE might crash it.


An imperfect alternative would be to set a (per-CPU?) flag indicating
that a ucode update is in progress, then check that flag early in the
MCE handler and warn very loudly. This seems to give us the best chance
of getting a useful diagnostic.
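
Roughly what I have in mind, as a userspace model rather than kernel code. Every name here (NR_CPUS_MODEL, ucode_update_begin(), ucode_update_end(), mce_check_ucode_update()) is made up for illustration, and a plain array indexed by CPU number stands in for real per-CPU storage. An actual patch would need to use the kernel's per-CPU and logging facilities and think about NMI-like context in the handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical CPU count for this model only. */
#define NR_CPUS_MODEL 4

/* One flag per CPU: true while that CPU is loading microcode.
 * Static storage is zero-initialized, so every flag starts false. */
static atomic_bool ucode_update_in_progress[NR_CPUS_MODEL];

/* Mark the start of a microcode update on @cpu. */
void ucode_update_begin(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	atomic_store(&ucode_update_in_progress[cpu], true);
}

/* Mark the end of a microcode update on @cpu. */
void ucode_update_end(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	atomic_store(&ucode_update_in_progress[cpu], false);
}

/*
 * Meant to be called first thing in the (modeled) MCE handler.
 * Returns true and warns loudly if the MCE arrived while @cpu was
 * in the middle of a microcode update; returns false otherwise.
 */
bool mce_check_ucode_update(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	if (!atomic_load(&ucode_update_in_progress[cpu]))
		return false;

	fprintf(stderr,
		"CPU %d: machine check during microcode update, "
		"system state may be unreliable\n", cpu);
	return true;
}
```

The point of checking the flag before doing anything else is that the warning has the best chance of making it out even if the rest of the handler cannot run reliably. It does not prevent the crash; it only tells us why it happened.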
