linux-kernel - Re: Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <fe0af2c8-92c8-8d66-e9f3-5a50d64837e5@citrix.com>
Date:   Tue, 23 Jun 2020 16:32:22 +0100
From:   Andrew Cooper <andrew.cooper3@...rix.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Joerg Roedel <jroedel@...e.de>
CC:     Andy Lutomirski <luto@...nel.org>, Joerg Roedel <joro@...tes.org>,
        "Dave Hansen" <dave.hansen@...el.com>,
        Tom Lendacky <Thomas.Lendacky@....com>,
        "Mike Stunes" <mstunes@...are.com>,
        Dan Williams <dan.j.williams@...el.com>,
        "Dave Hansen" <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>, Juergen Gross <JGross@...e.com>,
        Jiri Slaby <jslaby@...e.cz>, Kees Cook <keescook@...omium.org>,
        kvm list <kvm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Thomas Hellstrom <thellstrom@...are.com>,
        Linux Virtualization <virtualization@...ts.linux-foundation.org>,
        X86 ML <x86@...nel.org>,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP
 from userspace)

On 23/06/2020 16:16, Peter Zijlstra wrote:
> On Tue, Jun 23, 2020 at 04:49:40PM +0200, Joerg Roedel wrote:
>>> We're talking about the 3rd case where the only reason things 'work' is
>>> because we'll have to panic():
>>>
>>>  - #MC
>> Okay, #MC is special and can only be handled on a best-effort basis, as
>> #MC could happen anytime, also while already executing the #MC handler.
> I think the hardware has a MCE-mask bit somewhere. Flaky though because
> clearing it isn't 'atomic' with IRET, so there's a 'funny' window.

MSR_MCG_STATUS.MCIP, and yes - any #MC before that point will
immediately Shutdown.  Any #MC between that point and IRET will clobber
its IST stack and end up sad.

> It also interacts really bad with the NMI handler. If we get an #MC
> early in the NMI, where we hard-rely on the NMI-mask being set to set-up
> the recursion stack, then the #MC IRET will clear the NMI-mask, and
> we're toast.
>
> Andy has wild and crazy ideas, but I don't think we need more crazy
> here.

Want, certainly not.  Need, maybe :-/
>>>  - #DB with BUS LOCK DEBUG EXCEPTION
>> If I understand the problem correctly, this can be solved by moving off
>> the IST stack to the current task stack in the #DB handler, like I plan
>> to do for #VC, no?
> Hmm, probably. Would take a bit of care, but should be doable.

Andy and I have spent some time considering this.

Having exactly one vector move off IST and onto an in-use task-stack is
doable, I think, so long as it can sort out self-recursion protections.

Having more than one vector do this is very complicated.  You've got to
take care to walk up the list of IST-nesting to see if any interrupted
context is in the middle of trying to copy themselves onto the stack, so
you don't clobber the frame they're in the middle of copying.

~Andrew