linux-kernel - Re: [PATCH] Revert "KVM: x86: WARN and reject loading KVM if NX is supported but not enabled"

Message-ID: <20210713035944.l7qa7q4qsmqywg6u@linux.intel.com>
Date:   Tue, 13 Jul 2021 11:59:44 +0800
From:   Yu Zhang <yu.c.zhang@...ux.intel.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Revert "KVM: x86: WARN and reject loading KVM if NX is
 supported but not enabled"

On Mon, Jul 12, 2021 at 02:36:53PM +0000, Sean Christopherson wrote:
> On Mon, Jul 12, 2021, Yu Zhang wrote:
> > On Fri, Jul 09, 2021 at 05:21:52PM +0000, Sean Christopherson wrote:
> > > On Thu, Jul 08, 2021, Paolo Bonzini wrote:
> > > > So do we want this or "depends on X86_64 || X86_PAE"?
> > > 
> > > Hmm, I'm leaning towards keeping !PAE support purely for testing the !PAE<->PAE
> > > MMU transitions for nested virtualization.  It's not much coverage, and the !PAE
> > 
> > May I ask what "!PAE<->PAE MMU transition for nested virtualization" means?
> > Running L1 KVM with !PAE and L0 in PAE? I had thought KVM can only function
> > with PAE set (though I did not see any check of CR4 in kvm_arch_init()). Did
> > I miss something?
> 
> When L1 uses shadow paging, L0 KVM uses a single MMU instance for both L1 and
> L2, and relies on the MMU role to differentiate between L1 and L2.  KVM requires
> PAE for shadow paging, but does not require PAE in the host kernel.  So when L1
> KVM uses shadow paging, it can effectively use !PAE paging for L1 and PAE paging
> for L2.  L0 KVM needs to handle the !PAE<->PAE transitions when switching
> between L1 and L2, e.g. needs to correctly reinitialize the MMU context.

Hah... I did have a misunderstanding here: the host kernel does not need to
use PAE. Thanks for the explanation! :)
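
To check my understanding of the role-based switching described above, here
is a minimal user-space sketch. It is not KVM code: struct mmu_role,
struct mmu_ctx and mmu_switch() are hypothetical names invented for
illustration, and the real MMU role tracks far more state. It only models
one shared context that is rebuilt whenever the role (L1 vs. L2, !PAE vs.
PAE) changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an MMU role. */
struct mmu_role {
	bool guest_mode;	/* false: running L1, true: running L2 */
	bool pae;		/* paging mode in use for that level */
};

/* One shared context for both L1 and L2, as in the description above. */
struct mmu_ctx {
	struct mmu_role role;
	unsigned int reinit_count;	/* how many times it was rebuilt */
};

static bool role_equal(struct mmu_role a, struct mmu_role b)
{
	return a.guest_mode == b.guest_mode && a.pae == b.pae;
}

/*
 * Switch the shared context to new_role. The context is rebuilt only
 * when the role actually changes. Returns true if a rebuild happened.
 */
static bool mmu_switch(struct mmu_ctx *ctx, struct mmu_role new_role)
{
	assert(ctx != NULL);

	if (role_equal(ctx->role, new_role))
		return false;

	ctx->role = new_role;
	ctx->reinit_count++;
	return true;
}
```

In this model, going from a !PAE L1 role to a PAE L2 role and back forces
two rebuilds, while re-entering with an unchanged role forces none.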

> 
> > > NPT horror is a much bigger testing gap (because KVM doesn't support it), but on
> > > the other hand setting EFER.NX for !PAE kernels appears to be trivial, e.g.
> > > 
> > > diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
> > > index 67f590425d90..bfbea25a9fe8 100644
> > > --- a/arch/x86/kernel/head_32.S
> > > +++ b/arch/x86/kernel/head_32.S
> > > @@ -214,12 +214,6 @@ SYM_FUNC_START(startup_32_smp)
> > >         andl $~1,%edx                   # Ignore CPUID.FPU
> > >         jz .Lenable_paging              # No flags or only CPUID.FPU = no CR4
> > > 
> > > -       movl pa(mmu_cr4_features),%eax
> > > -       movl %eax,%cr4
> > > -
> > > -       testb $X86_CR4_PAE, %al         # check if PAE is enabled
> > > -       jz .Lenable_paging
> > > -
> > >         /* Check if extended functions are implemented */
> > >         movl $0x80000000, %eax
> > >         cpuid
> > > 
> > > My only hesitation is the risk of somehow breaking ancient CPUs by falling into
> > > the NX path.  Maybe try forcing EFER.NX=1 for !PAE, and fall back to requiring
> > > PAE if that gets NAK'd or needs to be reverted for whatever reason?
> > > 
> > 
> > One more dumb question: are you planning to set NX for linux with !PAE?
> 
> Yep.
> 
> > Why do we need EFER in that case? Thanks! :)
> 
> Because as you rightly remembered above, KVM always uses PAE paging for the guest,
> even when the host is !PAE.  And KVM also requires EFER.NX=1 for the guest when
> using shadow paging to handle a potential SMEP and !WP case.  
> 

I just saw this in update_transition_efer(), which now sets EFER.NX
unconditionally when shadow paging is in use. But I guess the host kernel still
needs to set EFER.NX for !PAE (e.g. in head_32.S), because the guest may not
touch EFER at all. Is this correct?
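
To make the question concrete, here is a simplified user-space model of how
I read the behaviour. It is only a sketch, not the actual
update_transition_efer() code: effective_guest_efer() and
efer_model_selfcheck() are made-up names, and the only architectural fact
assumed is that NXE is bit 11 of IA32_EFER.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFER_NX (1ULL << 11)	/* IA32_EFER.NXE, architectural bit 11 */

/*
 * Hypothetical model: the EFER value that would be in effect while the
 * guest runs. With shadow paging, NX is forced on regardless of what
 * the guest wrote; otherwise the guest's own value is used unchanged.
 */
static uint64_t effective_guest_efer(uint64_t guest_efer, bool shadow_paging)
{
	if (shadow_paging)
		guest_efer |= EFER_NX;
	return guest_efer;
}

/* Usage: a guest that never touched EFER (value 0) in both modes. */
static void efer_model_selfcheck(void)
{
	assert(effective_guest_efer(0, true) & EFER_NX);
	assert(!(effective_guest_efer(0, false) & EFER_NX));
}
```

So in this model a guest that leaves EFER at 0 still runs with NX set under
shadow paging, which is why I am asking about the host side.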

B.R.
Yu