Date:   Fri, 13 May 2022 13:54:19 -0700
From:   David Matlack <dmatlack@...gle.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm list <kvm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ben Gardon <bgardon@...gle.com>
Subject: Re: [PATCH 1/2] KVM: x86/mmu: Drop RWX=0 SPTEs during ept_sync_page()

On Fri, May 13, 2022 at 12:50 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> Drop SPTEs whose new protections will yield a RWX=0 SPTE, i.e. a SPTE
> that is marked shadow-present but is not-present in the page tables.  If
> EPT with execute-only support is in use by L1, KVM can create a RWX=0
> SPTE for an EPTE if the upper level combined permissions
> are R (or RW) and the leaf EPTE is changed from R (or RW) to X.

For some reason I found this sentence hard to read. What about this:

When shadowing EPT and NX HugePages is enabled, if the guest changes
the permissions on a huge page in the EPT12 to be execute-only, KVM
will end up shadowing it with an RWX=0 SPTE in the EPT02 when it picks
up the change in FNAME(sync_page). Note that the guest can't induce KVM
to create an RWX=0 SPTE during FNAME(fetch), since the only valid way
for the guest to fault in an execute-only huge page is with an
instruction fetch, which KVM will handle by mapping the page as an
executable 4KiB page.
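
To make the permission math from the quoted changelog concrete, here is a
rough standalone sketch (not kernel code). It assumes the effective access
is the intersection of what the upper levels and the leaf allow; the PERM_*
bits and combine_access() are hypothetical names made up for illustration
and are not KVM's ACC_* masks.

```c
#include <stdio.h>

/* Hypothetical permission bits for this sketch only (not KVM's ACC_* masks). */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/*
 * Assumption: the effective access of a translation is the intersection
 * of the permissions granted by the upper levels and by the leaf entry.
 */
static inline unsigned int combine_access(unsigned int upper, unsigned int leaf)
{
	return upper & leaf;
}
```

With upper = PERM_R | PERM_W and a leaf that was changed to PERM_X only, the
intersection is empty, which is the RWX=0 case. A leaf that is still PERM_R
under the same upper levels keeps PERM_R.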

> Because
> the EPTE is considered present when viewed in isolation, and no reserved
> bits are set, FNAME(prefetch_invalid_gpte) will consider the GPTE valid.
>
> Creating a not-present SPTE isn't fatal as the SPTE is "correct" in the
> sense that the guest translation is inaccessible (the combined protections
> of all levels yield RWX=0), i.e. the guest won't get stuck in an infinite
> loop.  If EPT A/D bits are disabled, KVM can mistake the SPTE for an
> access-tracked SPTE.  But again, such confusion isn't fatal as the "saved"
> protections are also RWX=0.
>
> Add a WARN in make_spte() to detect creation of SPTEs that will result in
> RWX=0 protections, which is the real motivation for fixing ept_sync_page().
> Creating a useless SPTE means KVM messed up _something_, even if whatever
> goof occurred doesn't manifest as a functional bug.
>
> Fixes: d95c55687e11 ("kvm: mmu: track read permission explicitly for shadow EPT page tables")
> Cc: David Matlack <dmatlack@...gle.com>
> Cc: Ben Gardon <bgardon@...gle.com>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  arch/x86/kvm/mmu/paging_tmpl.h | 9 ++++++++-
>  arch/x86/kvm/mmu/spte.c        | 2 ++
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index b025decf610d..d9f98f9ed4a0 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -1052,7 +1052,14 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
>                 if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access))
>                         continue;
>
> -               if (gfn != sp->gfns[i]) {
> +               /*
> +                * Drop the SPTE if the new protections would result in a RWX=0
> +                * SPTE or if the gfn is changing.  The RWX=0 case only affects
> +                * EPT with execute-only support, i.e. EPT without an effective
> +                * "present" bit, as all other paging modes will create a
> +                * read-only SPTE if pte_access is zero.
> +                */
> +               if ((!pte_access && !shadow_present_mask) || gfn != sp->gfns[i]) {
>                         drop_spte(vcpu->kvm, &sp->spt[i]);
>                         flush = true;
>                         continue;
> diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> index 75c9e87d446a..9ad60662beac 100644
> --- a/arch/x86/kvm/mmu/spte.c
> +++ b/arch/x86/kvm/mmu/spte.c
> @@ -101,6 +101,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
>         u64 spte = SPTE_MMU_PRESENT_MASK;
>         bool wrprot = false;
>
> +       WARN_ON_ONCE(!pte_access && !shadow_present_mask);
> +
>         if (sp->role.ad_disabled)
>                 spte |= SPTE_TDP_AD_DISABLED_MASK;
>         else if (kvm_mmu_page_ad_need_write_protect(sp))
> --
> 2.36.0.550.gb090851708-goog
>
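
For reference, the drop condition added to FNAME(sync_page) above can be
read as a small predicate. A minimal standalone sketch, assuming
hypothetical names (should_drop_spte, present_mask, new_gfn, old_gfn); it
only mirrors the boolean logic of the hunk and is not the kernel's code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the condition in the hunk above:
 * drop the SPTE if the new protections would be RWX=0 in a paging mode
 * with no effective "present" bit (present_mask == 0), or if the gfn
 * the SPTE maps is changing.
 */
static inline bool should_drop_spte(unsigned int pte_access,
				    uint64_t present_mask,
				    uint64_t new_gfn, uint64_t old_gfn)
{
	return (!pte_access && !present_mask) || new_gfn != old_gfn;
}
```

With a non-zero present_mask, a zero pte_access alone does not trigger the
drop, which matches the comment's point that other paging modes would end up
with a read-only SPTE instead.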
