linux-kernel - Re: [PATCH v3 00/21] Enable CET Virtualization

Message-ID: <20230720080357.GA3569127@hirez.programming.kicks-ass.net>
Date:   Thu, 20 Jul 2023 10:03:57 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Pankaj Gupta <pankaj.gupta.linux@...il.com>
Cc:     Sean Christopherson <seanjc@...gle.com>,
        Weijiang Yang <weijiang.yang@...el.com>, pbonzini@...hat.com,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org, rppt@...nel.org,
        binbin.wu@...ux.intel.com, rick.p.edgecombe@...el.com,
        john.allen@....com, Chao Gao <chao.gao@...el.com>,
        Andrew Cooper <Andrew.Cooper3@...rix.com>
Subject: Re: [PATCH v3 00/21] Enable CET Virtualization

On Thu, Jul 20, 2023 at 07:26:04AM +0200, Pankaj Gupta wrote:
> > > My understanding is that PL[0-2]_SSP are used only on transitions to the
> > > corresponding privilege level from a *different* privilege level.  That means
> > > KVM should be able to utilize the user_return_msr framework to load the host
> > > values.  Though if Linux ever supports SSS, I'm guessing the core kernel will
> > > have some sort of mechanism to defer loading MSR_IA32_PL0_SSP until an exit to
> > > userspace, e.g. to avoid having to write PL0_SSP, which will presumably be
> > > per-task, on every context switch.
> > >
> > > But note my original wording: **If that's necessary**
> > >
> > > If nothing in the host ever consumes those MSRs, i.e. if SSS is NOT enabled in
> > > IA32_S_CET, then running host stuff with guest values should be ok.  KVM only
> > > needs to guarantee that it doesn't leak values between guests.  But that should
> > > Just Work, e.g. KVM should load the new vCPU's values if SHSTK is exposed to the
> > > guest, and intercept (to inject #GP) if SHSTK is not exposed to the guest.
> > >
> > > And regardless of which mechanism ends up managing the SSP MSRs, it should only
> > > ever touch PL0_SSP, because Linux never runs anything at CPL1 or CPL2, i.e. will
> > > never consume PL{1,2}_SSP.
> >
> > To clarify, Linux will only use SSS in FRED mode -- FRED removes CPL1,2.
> 
> Trying to understand better what prevents SSS from being enabled pre-FRED.
> Is it better #CP exception handling with other nested exceptions?

SSS took the syscall gap and made it worse -- as in *way* worse.

To top it off, the whole SSS busy bit thing is fundamentally
incompatible with how we manage to survive nested exceptions in NMI
context.
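
[Editor's sketch, not part of the original message.] The busy-bit
problem can be illustrated with a toy model. This is a minimal sketch
under stated assumptions: the struct, the function names and the
token value are hypothetical, the busy flag is modelled as bit 0 of
the token, and a failed check is reported as a plain `false` rather
than as any specific fault. It only shows why a second NMI that
arrives before the first has released its shadow stack cannot be
delivered onto that same stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: a supervisor shadow stack token
 * whose bit 0 marks the stack as busy (currently in use). */
#define SSS_TOKEN_BUSY 0x1ULL

struct sss_token {
    uint64_t value; /* token address with the busy flag in bit 0 */
};

/* Model of event delivery onto a shadow stack: succeeds only if the
 * token is not busy, and marks it busy. Returns false where real
 * hardware would raise a fault instead of delivering the event. */
static bool sss_enter(struct sss_token *t)
{
    if (t->value & SSS_TOKEN_BUSY)
        return false;
    t->value |= SSS_TOKEN_BUSY;
    return true;
}

/* Model of the return path releasing the shadow stack. */
static void sss_leave(struct sss_token *t)
{
    assert(t->value & SSS_TOKEN_BUSY);
    t->value &= ~SSS_TOKEN_BUSY;
}

/* Two NMIs, one after the other: the second is delivered normally. */
static bool sequential_nmi_can_enter(void)
{
    struct sss_token nmi_stack = { .value = 0x1000 };

    if (!sss_enter(&nmi_stack))
        return false;
    sss_leave(&nmi_stack);
    return sss_enter(&nmi_stack);
}

/* A second NMI arrives while the first still owns the shadow stack
 * (e.g. after a nested exception's return unmasked NMIs early).
 * Returns whether that second NMI could be delivered. */
static bool nested_nmi_can_enter(void)
{
    struct sss_token nmi_stack = { .value = 0x1000 };
    bool first = sss_enter(&nmi_stack);
    bool second = sss_enter(&nmi_stack); /* token is still busy here */

    if (first)
        sss_leave(&nmi_stack);
    return second;
}
```

In this model `sequential_nmi_can_enter()` succeeds while
`nested_nmi_can_enter()` does not, which is the re-entrancy failure
the paragraph above refers to.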

Basically, the whole x86 exception / stack switching logic was already
borderline impossible (consider taking an MCE in the early NMI path
where we set up, but have not finished, the re-entrancy stuff), and
SSS pushed it over the edge and set it on fire.

And NMI isn't the only problem, the various new virt exceptions #VC and
#HV are on their own already near impossible, adding SSS again pushes
the whole thing into clear insanity.

There's a good exposition of the whole trainwreck by Andrew here:

  https://www.youtube.com/watch?v=qcORS8CN0ow

(that is, sorry for the youtube link, but Google is failing me in
finding the actual Google Doc that talk is based on, or even the slide
deck :/)



FRED solves all that by:

 - removing the stack gap, cs/ip/ss/sp/ssp/gs will all be switched
   atomically and consistently for every transition.

 - removing the non-reentrant IST mechanism and replacing it with stack
   levels

 - adding an explicit NMI latch

 - re-organising the actual shadow stacks and doing away with that busy
   bit thing (I need to re-read the FRED spec on this detail again).
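
[Editor's sketch, not part of the original message.] The difference
between IST and stack levels can be shown with a simplified model.
Everything here is an assumption made for illustration: the struct,
the function names, the addresses and the 64-byte frame size are
hypothetical, and the model ignores the frame that real event
delivery pushes. The only rule it encodes is that an IST entry always
restarts at a fixed stack top, whereas a stack-level entry switches
stacks only when the event's level is higher than the current one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_LEVELS 4

/* IST-style entry (simplified): always restart at the fixed top of the
 * IST stack, regardless of whether that stack is already in use. */
static uint64_t ist_entry_sp(uint64_t ist_top, uint64_t current_sp)
{
    (void)current_sp; /* deliberately ignored: this is the problem */
    return ist_top;
}

/* Stack-level entry (simplified): hypothetical per-CPU state. */
struct sl_cpu {
    unsigned int csl;              /* current stack level */
    uint64_t sp;                   /* current stack pointer */
    uint64_t level_top[NR_LEVELS]; /* top of each level's stack */
};

/* Switch to a fresh stack only when moving to a higher level;
 * otherwise keep running on the current stack at the current sp. */
static void sl_deliver(struct sl_cpu *cpu, unsigned int event_level)
{
    assert(event_level < NR_LEVELS);
    if (event_level > cpu->csl) {
        cpu->csl = event_level;
        cpu->sp = cpu->level_top[event_level];
    }
}

/* Nested entry on the same IST stack: does it land above live data? */
static bool ist_nested_entry_clobbers(void)
{
    const uint64_t top = 0x9000;
    uint64_t sp = ist_entry_sp(top, 0x5000); /* first entry */

    sp -= 64;                                /* handler pushes a frame */
    return ist_entry_sp(top, sp) > sp;       /* nested entry resets sp */
}

/* Nested entry at the same stack level: does it land above live data? */
static bool stack_level_nested_entry_clobbers(void)
{
    struct sl_cpu cpu = {
        .csl = 0,
        .sp = 0x5000,
        .level_top = { 0x6000, 0x7000, 0x8000, 0x9000 },
    };
    uint64_t live_sp;

    sl_deliver(&cpu, 2);      /* first entry: switch to level 2 */
    cpu.sp -= 64;             /* handler pushes a frame */
    live_sp = cpu.sp;
    sl_deliver(&cpu, 2);      /* nested entry: same level, no switch */
    return cpu.sp > live_sp;
}
```

In this model the nested IST entry restarts at the stack top and would
overwrite the first handler's frame, while the nested stack-level
entry simply continues below it.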



Crazy as we are, we're not touching legacy/IDT SSS with a ten foot pole,
sorry.