linux-kernel - Re: [RFC v1 05/26] x86/traps: Add #VE support for TDX guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YCbq+UEMIsE0NIWI@google.com>
Date:   Fri, 12 Feb 2021 12:54:17 -0800
From:   Sean Christopherson <seanjc@...gle.com>
To:     Dave Hansen <dave.hansen@...el.com>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Raj Ashok <ashok.raj@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: [RFC v1 05/26] x86/traps: Add #VE support for TDX guest

On Fri, Feb 12, 2021, Dave Hansen wrote:
> On 2/12/21 12:37 PM, Sean Christopherson wrote:
> > There needs to be a mechanism for lazy/deferred/on-demand acceptance of pages.
> > E.g. pre-accepting every page in a VM with hundreds of GB of memory will be
> > ridiculously slow.
> > 
> > #VE is the best option to do that:
> > 
> >   - Relatively sane re-entrancy semantics.
> >   - Hardware accelerated.
> >   - Doesn't require stealing an IRQ from the guest.
> 
> TDX already provides a basic environment for the guest when it starts
> up.  The guest has some known, good memory.  The guest also has a very,
> very clear understanding of which physical pages it uses and when.  It's
> staged, of course, as decompression happens and the guest comes up.
> 
> But, the guest still knows which guest physical pages it accesses and
> when.  It doesn't need on-demand faulting in of non-accepted pages.  It
> can simply decline to expose non-accepted pages to the wider system
> before they've been accepted.
> 
> It would be nuts to merrily free non-accepted pages into the page
> allocator and handle the #VE fallout as they're touched from
> god-knows-where.
> 
> I don't see *ANY* case for #VE to occur inside the guest kernel, outside
> of *VERY* narrow places like copy_from_user().  Period.  #VE from ring-0
> is not OK.
> 
> So, no, #VE is not the best option.  No #VE's in the first place is the
> best option.

Ah, I see what you're thinking.

Treating an EPT #VE as fatal was also considered as an option.  IIUC it was
thought that finding every nook and cranny that could access a page, without
forcing the kernel to pre-accept huge swaths of memory, would be very difficult.
It'd be wonderful if that's not the case.