linux-kernel - Re: [V1 PATCH 1/6] KVM: x86: Add support for testing private memory

Message-ID: <20221202002635.bkhs3h7skd7igtpr@amd.com>
Date:   Thu, 1 Dec 2022 18:26:35 -0600
From:   Michael Roth <michael.roth@....com>
To:     Sean Christopherson <seanjc@...gle.com>
CC:     Chao Peng <chao.p.peng@...ux.intel.com>,
        Vishal Annapurve <vannapurve@...gle.com>, <x86@...nel.org>,
        <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-kselftest@...r.kernel.org>, <pbonzini@...hat.com>,
        <vkuznets@...hat.com>, <wanpengli@...cent.com>,
        <jmattson@...gle.com>, <joro@...tes.org>, <tglx@...utronix.de>,
        <mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>,
        <hpa@...or.com>, <shuah@...nel.org>, <yang.zhong@...el.com>,
        <ricarkol@...gle.com>, <aaronlewis@...gle.com>,
        <wei.w.wang@...el.com>, <kirill.shutemov@...ux.intel.com>,
        <corbet@....net>, <hughd@...gle.com>, <jlayton@...nel.org>,
        <bfields@...ldses.org>, <akpm@...ux-foundation.org>,
        <yu.c.zhang@...ux.intel.com>, <jun.nakajima@...el.com>,
        <dave.hansen@...el.com>, <qperret@...gle.com>,
        <steven.price@....com>, <ak@...ux.intel.com>, <david@...hat.com>,
        <luto@...nel.org>, <vbabka@...e.cz>, <marcorr@...gle.com>,
        <erdemaktas@...gle.com>, <pgonda@...gle.com>, <nikunj@....com>,
        <diviness@...gle.com>, <maz@...nel.org>, <dmatlack@...gle.com>,
        <axelrasmussen@...gle.com>, <maciej.szmigiero@...cle.com>,
        <mizhang@...gle.com>, <bgardon@...gle.com>,
        <ackerleytng@...gle.com>
Subject: Re: [V1 PATCH 1/6] KVM: x86: Add support for testing private memory

On Tue, Nov 22, 2022 at 08:06:01PM +0000, Sean Christopherson wrote:
> On Tue, Nov 22, 2022, Chao Peng wrote:
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 10017a9f26ee..b3118d00b284 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -4280,6 +4280,10 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> > >  
> > >  	fault->gfn = fault->addr >> PAGE_SHIFT;
> > >  	fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
> > > +#ifdef CONFIG_HAVE_KVM_PRIVATE_MEM_TESTING
> > > +	fault->is_private = kvm_slot_can_be_private(fault->slot) &&
> > > +			kvm_mem_is_private(vcpu->kvm, fault->gfn);
> > > +#endif
> > >  
> > >  	if (page_fault_handle_page_track(vcpu, fault))
> > >  		return RET_PF_EMULATE;
> > > diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> > > index 5cdff5ca546c..2e759f39c2c5 100644
> > > --- a/arch/x86/kvm/mmu/mmu_internal.h
> > > +++ b/arch/x86/kvm/mmu/mmu_internal.h
> > > @@ -188,7 +188,6 @@ struct kvm_page_fault {
> > >  
> > >  	/* Derived from mmu and global state.  */
> > >  	const bool is_tdp;
> > > -	const bool is_private;
> > >  	const bool nx_huge_page_workaround_enabled;
> > >  
> > >  	/*
> > > @@ -221,6 +220,9 @@ struct kvm_page_fault {
> > >  	/* The memslot containing gfn. May be NULL. */
> > >  	struct kvm_memory_slot *slot;
> > >  
> > > +	/* Derived from encryption bits of the faulting GPA for CVMs. */
> > > +	bool is_private;
> > 
> > Either we can wrap it with the CONFIG_HAVE_KVM_PRIVATE_MEM_TESTING or if
> > it looks ugly I can remove the "const" in my code.
> 
> Hmm, I think we can keep the const.  Similar to the bug in kvm_faultin_pfn()[*],
> the kvm_slot_can_be_private() is bogus.  A fault should be considered private if
> it's marked as private, whether or not userspace has configured the slot to be
> private is irrelevant.  I.e. the xarray is the single source of truth, memslots
> are just plumbing.

I've been looking at pulling this series into our SNP+UPM patchset (and
replacing the UPM selftests that were included with UPMv9). We ended up
with something similar to what you've suggested, but instead of calling
kvm_mem_is_private() directly we added a wrapper in mmu_internal.h that's
called via:

kvm_mmu_do_page_fault():
  struct kvm_page_fault fault = {
    ...
    .is_private = kvm_mmu_fault_is_private()

where kvm_mmu_fault_is_private() is defined something like:

static bool kvm_mmu_fault_is_private(struct kvm *kvm, gpa_t gpa, u64 err)
{
        struct kvm_memory_slot *slot;
        gfn_t gfn = gpa_to_gfn(gpa);
        bool private_fault = false;

        slot = gfn_to_memslot(kvm, gfn);
        if (!slot)
                goto out;

        if (!kvm_slot_can_be_private(slot))
                goto out;

        /*
         * If the platform hook returns 1, use its determination of
         * private_fault.
         */
        if (static_call(kvm_x86_fault_is_private)(kvm, gpa, err, &private_fault) == 1)
                goto out;

        /*
         * Handling below is for guests that rely on the VMM to control when a fault
         * should be treated as private or not via KVM_MEM_ENCRYPT_{REG,UNREG}_REGION.
         * This is mainly for the KVM self-tests for restricted memory.
         */
#ifdef CONFIG_HAVE_KVM_PRIVATE_MEM_TESTING
        private_fault = kvm_mem_is_private(kvm, gfn);
#endif

out:
        return private_fault;
}
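
To make the ordering of those checks explicit, here is a minimal userspace
model of the wrapper above. It is only a sketch: the model_* types and fields
are hypothetical stand-ins rather than real KVM definitions, and the platform
hook is reduced to a pair of booleans.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memslot; not the real struct kvm_memory_slot. */
struct model_slot {
        bool can_be_private;            /* models kvm_slot_can_be_private() */
};

/* Hypothetical inputs that the real wrapper would look up itself. */
struct model_fault_ctx {
        const struct model_slot *slot;  /* NULL if no memslot covers the gfn */
        bool hook_decides;              /* models the platform hook returning 1 */
        bool hook_private;              /* the hook's answer when it decides */
        bool testing_enabled;           /* models CONFIG_HAVE_KVM_PRIVATE_MEM_TESTING */
        bool attr_private;              /* models the kvm_mem_is_private() result */
};

/* Same order as the wrapper: slot checks, then the hook, then the fallback. */
static bool model_fault_is_private(const struct model_fault_ctx *ctx)
{
        if (!ctx->slot)
                return false;

        if (!ctx->slot->can_be_private)
                return false;

        if (ctx->hook_decides)
                return ctx->hook_private;

        return ctx->testing_enabled && ctx->attr_private;
}
```

In this model a fault with no slot, or with a slot that cannot be private, is
never treated as private, whatever the attribute lookup would have returned.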

I tried removing kvm_slot_can_be_private() based on your comments, but
we ended up hitting a crash in restrictedmem_get_page(). I think this is
because the xarray currently defaults to 'private', so when KVM MMU relies
only on xarray it can hit cases where it thinks a GPA should be backed
by a restricted page, but when it calls kvm_restrictedmem_get_pfn() a
NULL slot->restricted_file gets passed to restrictedmem_get_page() and it
blows up.

I know Chao mentioned they were considering switching to 'shared' as the
default xarray value, which might fix this issue, but until then we've
left these checks in place.
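
The failure mode can be modelled in userspace roughly as follows. This is a
sketch under stated assumptions, not the actual KVM code: the model_* types
are hypothetical, the attribute store is reduced to a single flag whose unset
state reads back as private, and a slot is assumed to be private-capable
exactly when it has a restricted file.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real kernel structures. */
struct model_file {
        int unused;
};

struct model_slot {
        struct model_file *restricted_file;     /* NULL for an ordinary slot */
};

/* Models an attribute store whose unset entries default to private. */
struct model_attrs {
        bool explicitly_shared;
};

static bool model_mem_is_private(const struct model_attrs *attrs)
{
        return !attrs->explicitly_shared;
}

/* Assumption: the slot can be private only if it has a restricted file. */
static bool model_slot_can_be_private(const struct model_slot *slot)
{
        return slot && slot->restricted_file;
}

/*
 * True when the private fault path would be handed a NULL restricted file,
 * which is the situation that crashes in the real code.
 */
static bool model_private_path_gets_null_file(const struct model_slot *slot,
                                              bool is_private)
{
        return is_private && (!slot || !slot->restricted_file);
}
```

With an ordinary slot and an unset attribute, deciding from the attribute
alone marks the fault private and reaches the NULL file; AND-ing in the slot
check keeps the fault shared and avoids that path.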

Just figured I'd mention this in case Vishal hits similar issues.

-Mike

> 
> Then kvm_mmu_do_page_fault() can do something like:
> 
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index dbaf6755c5a7..456a9daa36e5 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -260,6 +260,8 @@ enum {
>  static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>                                         u32 err, bool prefetch)
>  {
> +       bool is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault);
> +
>         struct kvm_page_fault fault = {
>                 .addr = cr2_or_gpa,
>                 .error_code = err,
> @@ -269,13 +271,15 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>                 .rsvd = err & PFERR_RSVD_MASK,
>                 .user = err & PFERR_USER_MASK,
>                 .prefetch = prefetch,
> -               .is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault),
> +               .is_tdp = is_tdp,
>                 .nx_huge_page_workaround_enabled =
>                         is_nx_huge_page_enabled(vcpu->kvm),
>  
>                 .max_level = KVM_MAX_HUGEPAGE_LEVEL,
>                 .req_level = PG_LEVEL_4K,
>                 .goal_level = PG_LEVEL_4K,
> +               .is_private = IS_ENABLED(CONFIG_HAVE_KVM_PRIVATE_MEM_TESTING) && is_tdp &&
> +                             kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT),
>         };
>         int r;
> 
> [*] https://lore.kernel.org/all/Y3Vgc5KrNRA8r6vh@google.com