linux-kernel - Re: [RFC PATCH 00/10] KVM: selftests: Add support for test-selectable ucall implementations

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YdTjnRZQID5IabK0@google.com>
Date:   Wed, 5 Jan 2022 00:17:33 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Michael Roth <michael.roth@....com>
Cc:     linux-kselftest@...r.kernel.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        Nathan Tempelman <natet@...gle.com>,
        Marc Orr <marcorr@...gle.com>,
        Steve Rutherford <srutherford@...gle.com>,
        Mingwei Zhang <mizhang@...gle.com>,
        Brijesh Singh <brijesh.singh@....com>,
        Tom Lendacky <thomas.lendacky@....com>,
        Varad Gautam <varad.gautam@...e.com>,
        Shuah Khan <shuah@...nel.org>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        David Woodhouse <dwmw@...zon.co.uk>,
        Ricardo Koller <ricarkol@...gle.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>,
        Christian Borntraeger <borntraeger@...ux.ibm.com>,
        Janosch Frank <frankja@...ux.ibm.com>,
        David Hildenbrand <david@...hat.com>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>,
        Marc Zyngier <maz@...nel.org>,
        James Morse <james.morse@....com>,
        Alexandru Elisei <alexandru.elisei@....com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        kvmarm@...ts.cs.columbia.edu
Subject: Re: [RFC PATCH 00/10] KVM: selftests: Add support for
 test-selectable ucall implementations

On Tue, Jan 04, 2022, Michael Roth wrote:
> On Thu, Dec 30, 2021 at 09:11:12PM +0000, Sean Christopherson wrote:
> > not in-kernel.  That is bound to bite someone.  The only issue with SEV is the
> > address, not the VM-Exit mechanism.  That doesn't change with SEV-ES, SEV-SNP,
> > or TDX, as PIO and HLT will both get reflected as #VC/#VE, i.e. the guest side
> > needs to be updated to use VMGEXIT/TDCALL no matter what, at which point having
> > the hypercall request PIO emulation is just as easy as requesting HLT.
> 
> I'm not aware of any #VC handling needed for HLT in the case of
> SEV-ES/SEV-SNP. That was one of the reasons for the SEV tests using
> this ucall implementation.

Ah, you're right, HLT is an "automatic" exit and the CPU takes care of adjusting
RIP.  TDX is the only one that requires a hypercall.

> Of course, at some point, we'd want full support for PIO/MMIO/etc. in the #VC
> handler, but it's not something I'd planned on adding until after the SEV-SNP
> tests, since it seems like we'd need to import a bunch of intruction decoding
> code from elsewhere in the kernel, which is a lot of churn that's not
> immediately necessary for getting at least some basic tests in place. Since
> the HLT implementation is only 20 lines of code it seemed like a reasonable
> stop-gap until we start getting more CoCo tests in place. But the in-kernel
> APIC issue probably needs more consideration...
> 
> Perhaps for *just* PIO, the intruction decoding can be open-coded so it
> can be added to the initial #VC handler implementation, which would avoid the
> need for HLT implementation. I'll take a look at that.

PIO shouldn't require instruction decoding or a #VC handler.  What I was thinking
is that the guest in the selftest would make a direct #VMGEXIT/TDCALL to request
PIO instead of executing an OUT.  

> > I also don't like having to differentiate between a "shared" and "regular" ucall.
> > I kind of like having to explicitly pass the ucall object being used, but that
> > puts undue burden on simple single-vCPU tests.
> 
> I tried to avoid it, but I got hung up on that fact that pre-allocating
> arrays/lists of ucall structs needs to be done for each VM, and so we'd
> end up needing some way for a guest to identify which pool it's ucall
> struct should be allocated from. But you've gotten around that by just
> sync_global_to_guest()'ing for each pool at the time ucall_init() is
> called, so the guest only ever sees it's particular pool. Then the switch
> from writing GVA to writing GPA solves the translation problem. Nice.
> 
> > 
> > The inability to read guest private memory is really the only issue, and that can
> > be easily solved without completely revamping the ucall framework, and without
> > having to update a huge pile of tests to make them place nice with private memory.
> 
> I think the first 5 patches in this series are still relevant cleanups
> vs. having a complete standalone ucall implementation for each arch, and Andrew
> has also already started looking at other header cleanups related to
> patch #1, so maybe Paolo would still like to queue those. Would also
> provide a better starting point for having a centralized allocator for
> the ucall structs, which you hinted at wanting below.
> 
> But the subsequent patches that add the ucall_shared() interfaces should
> probably be set aside for now in favor of your proposal.
> 
> > 
> > This would also be a good opportunity to clean up the stupidity of tests having to
> > manually call ucall_init(), drop the unused/pointless @arg from ucall_init(), and
> > maybe even fix arm64's lurking landmine of not being SMP safe (the address is shared
> > by all vCPUs).
> 
> I thought you *didn't* want to update a huge pile of tests :) I suppose
> it's unavoidable, since with your proposal, having something like ucall_init()
> being called at some point is required, as opposed to the current
> implementation where it is optional. Are you intending to have it be
> called automatically by vm_create*()?

Yeah, I was thinking it could be done at the lowest level vm_create() helper.
We'll need to expand vm_create() (or add yet another layer to avoid modifying a
pile of tests) to allow opting out of initializing ucall, e.g. sev_migrate_tests.c
needs to create multiple concurrent VMs, but happily doesn't need ucall support.

> > To reduce the burden on tests and avoid ordering issues with creating vCPUs,
> > allocate a ucall struct for every possible vCPU when the VM is created and stuff
> > the GPA of the struct in the struct itself so that the guest can communicate the
> > GPA instead of the GVA.  Then confidential VMs just need to make all structs shared.
> 
> So a separate call like:
> 
>   ucall_make_shared(vm->ucall_list)
> 
> ? Might need some good documentation/assertions to make sure it gets
> called at the right place for confidential VMs, and may need some extra
> hooks in SEV selftest implementation for switching from private to shared
> after the memory has already been allocated, but seems reasonable.

Again, I was thinking that it would be done unconditionally by ucall_init(), i.e.
would be automatically handled by the selftest framework and would Just Work for
individual tests.

> > If all architectures have a way to access a vCPU ID, the ucall structs could be
> > stored as a simple array.  If not, a list based allocator would probably suffice.
> 
> I think list allocator is nicer, generating #VCs for both the PIO and the
> cpuid checks for vCPU lookup seems like a lot of extra noise to sift
> through while debugging where an errant test is failing, and doesn't seem to
> have any disadvantage vs. an array.

Ah, right, I forgot that querying the vCPU ID would require a hypercall.