linux-kernel - Re: [PATCH 0/7] KVM: x86: guest MAXPHYADDR and C-bit fixes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <YNUZqRK3ZMdsNdt4@google.com>
Date:   Thu, 24 Jun 2021 23:47:53 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Tom Lendacky <thomas.lendacky@....com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Peter Gonda <pgonda@...gle.com>,
        Brijesh Singh <brijesh.singh@....com>
Subject: Re: [PATCH 0/7] KVM: x86: guest MAXPHYADDR and C-bit fixes

On Thu, Jun 24, 2021, Sean Christopherson wrote:
> On Thu, Jun 24, 2021, Tom Lendacky wrote:
> > On 6/24/21 12:39 PM, Tom Lendacky wrote:
> > > 
> > > 
> > > On 6/24/21 12:31 PM, Sean Christopherson wrote:
> > >> On Thu, Jun 24, 2021, Tom Lendacky wrote:
> > >>>>
> > >>>> Here's an explanation of the physical address reduction for bare-metal and
> > >>>> guest.
> > >>>>
> > >>>> With MSR 0xC001_0010[SMEE] = 0:
> > >>>>   No reduction in host or guest max physical address.
> > >>>>
> > >>>> With MSR 0xC001_0010[SMEE] = 1:
> > >>>> - Reduction in the host is enumerated by CPUID 0x8000_001F_EBX[11:6],
> > >>>>   regardless of whether SME is enabled in the host or not. So, for example
> > >>>>   on EPYC generation 2 (Rome) you would see a reduction from 48 to 43.
> > >>>> - There is no reduction in physical address in a legacy guest (non-SEV
> > >>>>   guest), so the guest can use a 48-bit physical address
> > >>
> > >> So the behavior I'm seeing is either a CPU bug or user error.  Can you verify
> > >> the unexpected #PF behavior to make sure I'm not doing something stupid?
> > > 
> > > Yeah, I saw that in patch #3. Let me see what I can find out. I could just
> > > be wrong on that myself - it wouldn't be the first time.
> > 
> > From patch #3:
> >   SVM: KVM: CPU #PF @ rip = 0x409ca4, cr2 = 0xc0000000, pfec = 0xb
> >   KVM: guest PTE = 0x181023 @ GPA = 0x180000, level = 4
> >   KVM: guest PTE = 0x186023 @ GPA = 0x181000, level = 3
> >   KVM: guest PTE = 0x187023 @ GPA = 0x186000, level = 2
> >   KVM: guest PTE = 0xffffbffff003 @ GPA = 0x187000, level = 1
> >   SVM: KVM: GPA = 0x7fffbffff000
> > 
> > I think you may be hitting a special HT region that is at the top 12GB of
> > the 48-bit memory range and is reserved, even for GPAs. Can you somehow
> > get the test to use an address below 0xfffd_0000_0000? That would show
> > that bit 47 is valid for the legacy guest while staying out of the HT region.
> 
> I can make that happen.

Ah, hilarious.  That indeed does the trick.  0xfffd00000000 = #PF,
0xfffcfffff000 = good.

I'll send a revert shortly.  There's another C-bit bug that needs fixing, too :-/
The unconditional __sme_clr() in npf_interception() is wrong and breaks non-SEV
guests.  Based on this from the APM

  If the C-bit is an address bit, this bit is masked from the guest
  physical address when it is translated through the nested page tables.
  Consequently, the hypervisor does not need to be aware of which pages
  the guest has chosen to mark private.

I assume it's not needed for SEV either?  I'm about to find out shortly, but if
you happen to know for sure... :-)