lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YEaE9K/dAjImXv0f@google.com>
Date:   Mon, 8 Mar 2021 12:11:32 -0800
From:   Sean Christopherson <seanjc@...gle.com>
To:     Tom Lendacky <thomas.lendacky@....com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Ben Gardon <bgardon@...gle.com>
Subject: Re: [PATCH 20/24] KVM: x86/mmu: Use a dedicated bit to track
 shadow/MMU-present SPTEs

On Mon, Mar 08, 2021, Tom Lendacky wrote:
> On 2/25/21 2:47 PM, Sean Christopherson wrote:
> > Introduce MMU_PRESENT to explicitly track which SPTEs are "present" from
> > the MMU's perspective.  Checking for shadow-present SPTEs is a very
> > common operation for the MMU, particularly in hot paths such as page
> > faults.  With the addition of "removed" SPTEs for the TDP MMU,
> > identifying shadow-present SPTEs is quite costly especially since it
> > requires checking multiple 64-bit values.
> > 
> > On 64-bit KVM, this reduces the footprint of kvm.ko's .text by ~2k bytes.
> > On 32-bit KVM, this increases the footprint by ~200 bytes, but only
> > because gcc now inlines several more MMU helpers, e.g. drop_parent_pte().
> > 
> > Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> > ---
> >   arch/x86/kvm/mmu/spte.c |  8 ++++----
> >   arch/x86/kvm/mmu/spte.h | 11 ++++++++++-
> >   2 files changed, 14 insertions(+), 5 deletions(-)
> 
> I'm trying to run a guest on my Rome system using the queue branch, but
> I'm encountering an error that I bisected to this commit. In the guest
> (during OVMF boot) I see:
> 
> error: kvm run failed Invalid argument
> RAX=0000000000000000 RBX=00000000ffc12792 RCX=000000007f58401a RDX=000000007faaf808
> RSI=0000000000000010 RDI=00000000ffc12792 RBP=00000000ffc12792 RSP=000000007faaf740
> R8 =0000000000000792 R9 =000000007faaf808 R10=00000000ffc12793 R11=00000000000003f8
> R12=0000000000000010 R13=0000000000000000 R14=000000007faaf808 R15=0000000000000012
> RIP=000000007f6e9a90 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
> SS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> DS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> FS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> GS =0030 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
> GDT=     000000007f5ee698 00000047
> IDT=     000000007f186018 00000fff
> CR0=80010033 CR2=0000000000000000 CR3=000000007f801000 CR4=00000668
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000d00
> Code=22 00 00 e8 c0 e6 ff ff 48 83 c4 20 45 84 ed 74 07 fb eb 04 <44> 88 65 00 58 5b 5d 41 5c 41 5d c3 55 48 0f af 3d 1b 37 00 00 be 20 00 00 00 48 03 3d 17
> 
> On the hypervisor, I see the following:
> 
> [   55.886136] get_mmio_spte: detect reserved bits on spte, addr 0xffc12792, dump hierarchy:
> [   55.895284] ------ spte 0x1344a0827 level 4.
> [   55.900059] ------ spte 0x134499827 level 3.
> [   55.904877] ------ spte 0x165bf0827 level 2.
> [   55.909651] ------ spte 0xff800ffc12817 level 1.

Ah fudge.  I know what's wrong.  The MMIO generation uses bit 11, which means
is_shadow_present_pte() can get a false positive on high MMIO generations.  This
particular error can be squashed by explicitly checking for MMIO sptes in
get_mmio_spte(), but I'm guessing there are other flows where a false positive
is fatal (probably the __pte_list_remove bug below...).  The safe thing to do is
to steal bit 11 from MMIO SPTEs so that shadow present PTEs are the only thing
that sets that bit.

I'll reproduce by stuffing the MMIO generation and get a patch posted.  Sorry :-/

> When I kill the guest, I get a kernel panic:
> 
> [   95.539683] __pte_list_remove: 0000000040567a6a 0->BUG
> [   95.545481] kernel BUG at arch/x86/kvm/mmu/mmu.c:896!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ