Message-ID: <CAJhGHyApvmQk4bxxK2rJKzyAShFSXyEb2W0qyFcVoUEcsMKs_w@mail.gmail.com>
Date:   Sat, 28 Nov 2020 10:04:01 +0800
From:   Lai Jiangshan <jiangshanlai@...il.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Sean Christopherson <seanjc@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>, kvm@...r.kernel.org,
        Lai Jiangshan <laijs@...ux.alibaba.com>,
        Jonathan Corbet <corbet@....net>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        Avi Kivity <avi@...ranet.com>, linux-doc@...r.kernel.org
Subject: Re: [PATCH] kvm/x86/mmu: use the correct inherited permissions to get
 shadow page

On Sat, Nov 28, 2020 at 12:48 AM Paolo Bonzini <pbonzini@...hat.com> wrote:
>
> On 26/11/20 01:05, Sean Christopherson wrote:
> > On Fri, Nov 20, 2020, Lai Jiangshan wrote:
> >> From: Lai Jiangshan <laijs@...ux.alibaba.com>
> >>
> >> Commit 41074d07c78b ("KVM: MMU: Fix inherited permissions for emulated
> >> guest pte updates") said role.access is common access permissions for
> >> all ptes in this shadow page, which is the inherited permissions from
> >> the parent ptes.
> >>
> >> But the commit did not enforce this definition when kvm_mmu_get_page()
> >> is called in FNAME(fetch). Rather, it uses a random (last level pte's
> >> combined) access permissions.
> >
> > I wouldn't say it's random, the issue is specifically that all shadow pages end
> > up using the combined set of permissions of the entire walk, as opposed to the
> > only combined permissions of its parents.
> >
> >> And the permissions won't be checked again in next FNAME(fetch) since the
> >> spte is present. It might fail to meet guest's expectation when guest sets up
> >> spaghetti pagetables.
> >
> > Can you provide details on the exact failure scenario?  It would be very helpful
> > for documentation and understanding.  I can see how using the full combined
> > permissions will cause weirdness for upper level SPs in kvm_mmu_get_page(), but
> > I'm struggling to connect the dots to understand how that will cause incorrect
> > behavior for the guest.  AFAICT, outside of the SP cache, KVM only consumes
> > role.access for the final/last SP.
> >
>
> Agreed, a unit test would be even better, but just a description in the
> commit message would be enough.
>
> Paolo
>

Here is what I have in mind, though I haven't tested it:

pgd[]pud[]  pmd[]        pte[]            virtual address pointers
 (same hpa as pmd2\)  /->pte1(u--)->page1 <- ptr1 (u--)
         /->pmd1(uw-)--->pte2(uw-)->page2 <- ptr2 (uw-)
pgd->pud-|           (shared pte[] as above)
         \->pmd2(u--)--->pte1(u--)->page1 <- ptr3 (u--)
 (same hpa as pmd1/)  \->pte2(uw-)->page2 <- ptr4 (u--)


pmd1 and pmd2 point to the same pte table, so:
ptr1 and ptr3 point to the same page.
ptr2 and ptr4 point to the same page.
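
The permissions the guest expects for each pointer in the diagram can be
checked with a small sketch. This is a toy model, not KVM code: the bit
encoding and all names (PMD, PTE, PTRS, effective) are made up for
illustration, and it assumes the pgd and pud entries grant at least every
permission seen at the lower levels, so only the pmd and pte entries matter.

```python
# Toy model only; none of these names exist in KVM.
U, W, X = 4, 2, 1  # user, write, execute as illustrative bit flags

def fmt(perms):
    """Render a permission mask in the diagram's notation, e.g. 'uw-'."""
    return "".join(c if perms & bit else "-"
                   for c, bit in (("u", U), ("w", W), ("x", X)))

# Guest entries from the diagram; pmd1 and pmd2 point to the same pte table.
PMD = {"pmd1": U | W, "pmd2": U}
PTE = {"pte1": U, "pte2": U | W}
PTRS = {
    "ptr1": ("pmd1", "pte1"),
    "ptr2": ("pmd1", "pte2"),
    "ptr3": ("pmd2", "pte1"),
    "ptr4": ("pmd2", "pte2"),
}

def effective(ptr):
    """Effective access is the intersection of every level on the walk."""
    pmd, pte = PTRS[ptr]
    return PMD[pmd] & PTE[pte]

for name in PTRS:
    print(name, fmt(effective(name)))
# ptr1 u--
# ptr2 uw-
# ptr3 u--
# ptr4 u--
```

ptr2 and ptr4 reach the same pte2(uw-) entry, yet only ptr2 should be
writable, because pmd2 strips the write permission on the ptr4 path.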

  The guest first reads from ptr1, so the hypervisor gets the
shadow pte page table with role.access=u--, among other things.
   (Note that the shadowed pmd1's access is uwx.)

  Then the guest writes to ptr2, and the hypervisor sets up the
shadow page for ptr2.
   (Note that the hypervisor silently accepts the role.access=u--
    shadow pte page table in FNAME(fetch).)

  After that, the guest reads from ptr3, and the hypervisor
reuses the same shadow pte page table as above.

  Finally, the guest writes to ptr4 with neither a vmexit nor a
page fault, although this write should cause a vmexit, as the
guest expects.
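
The four accesses can be replayed in a toy simulation. This is only a
sketch under stated assumptions, not KVM code: ShadowMMU, sp_cache, spmd
and run are hypothetical names, the shadow page cache is keyed only by
role.access (there is a single guest pte table here), and non-leaf shadow
entries are assumed to carry full permissions so that only leaf sptes
enforce access. With fixed=False the shadow pte page is looked up with the
combined permissions of the whole walk; with fixed=True it is looked up
with the permissions inherited from the parent entries only.

```python
# Toy model only; none of these names exist in KVM.
U, W = 4, 2  # user and write as illustrative bit flags

# Guest entries from the diagram; pmd1 and pmd2 share one pte table.
PMD = {"pmd1": U | W, "pmd2": U}
PTE = {"pte1": U, "pte2": U | W}
PTRS = {"ptr1": ("pmd1", "pte1"), "ptr2": ("pmd1", "pte2"),
        "ptr3": ("pmd2", "pte1"), "ptr4": ("pmd2", "pte2")}

class ShadowMMU:
    def __init__(self, fixed):
        self.fixed = fixed
        self.sp_cache = {}  # role.access -> shadow pte page {pte name: perms}
        self.spmd = {}      # present shadow pmd entries: pmd name -> shadow page

    def access(self, ptr, need):
        """Return True if the access completes without any vmexit."""
        pmd, pte = PTRS[ptr]
        sp = self.spmd.get(pmd)
        if sp is not None and (sp.get(pte, 0) & need) == need:
            return True  # hardware walks the shadow tables on its own
        # vmexit: the hypervisor walks the guest page tables.
        pt_access = PMD[pmd]               # inherited from the parents only
        pte_access = pt_access & PTE[pte]  # combined over the entire walk
        if sp is None:
            role = pt_access if self.fixed else pte_access
            sp = self.sp_cache.setdefault(role, {})
            self.spmd[pmd] = sp  # once present, never re-checked
        if (pte_access & need) == need:
            sp[pte] = pte_access  # install the leaf spte
        # otherwise a page fault would be injected into the guest
        return False

def run(fixed):
    mmu = ShadowMMU(fixed)
    mmu.access("ptr1", U)             # read
    mmu.access("ptr2", U | W)         # write
    mmu.access("ptr3", U)             # read
    return mmu.access("ptr4", U | W)  # write; True means no vmexit at all

print(run(fixed=False))  # True
print(run(fixed=True))   # False
```

In the unfixed run, both pmd entries end up linked to the one shadow page
cached under u--, which already holds a uw- leaf for pte2, so the write
through ptr4 goes unnoticed. In the fixed run, pmd1 and pmd2 get separate
shadow pages (uw- and u--), and the write through ptr4 traps.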

In theory, guest userspace can trick the guest kernel if the guest
kernel sets up page tables like this.

Such spaghetti pagetables are unlikely to be seen in the guest.

But consider a guest that uses KPTI and does not use SMEP. With KPTI,
all pgd entries in the lower/userspace part of the kernel pagetable
are marked NX, which means SMEP is not needed.
(see arch/x86/mm/pti.c)

Assume the guest does disable SMEP and has a flaw that lets the
guest user trick the guest kernel into executing from the lower
part of the address space.

Normally, the NX bit set on the kernel pagetable's lower pgd
entries helps in this case. But in a guest backed by shadow pages
in the hypervisor, the guest user can make those NX bits useless.

Again, I haven't tested this either. I will try it later and
update the patch, including adding some more checks in mmu.c.

Thanks,
Lai
