Message-ID: <YBB0AT6xfObR7A5l@google.com>
Date: Tue, 26 Jan 2021 11:56:49 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Jim Mattson <jmattson@...gle.com>,
Chenyi Qiang <chenyi.qiang@...el.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Joerg Roedel <joro@...tes.org>,
Xiaoyao Li <xiaoyao.li@...el.com>,
kvm list <kvm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC 6/7] KVM: X86: Expose PKS to guest and userspace
On Tue, Jan 26, 2021, Paolo Bonzini wrote:
> On 30/09/20 06:36, Sean Christopherson wrote:
> > > CR4.PKS is not in the list of CR4 bits that result in a PDPTE load.
> > > Since it has no effect on PAE paging, I would be surprised if it did
> > > result in a PDPTE load.
> > It does belong in the mmu_role_bits though ;-)
> >
>
> Does it? We don't support PKU/PKS for shadow paging, and it's always zero
> for EPT. We only support enough PKU/PKS for emulation.
As proposed, yes. The PKU/PKS mask is tracked on a per-mmu basis, e.g. computed
in update_pkr_bitmask() and consumed in permission_fault() during emulation.
Omitting CR4.PKS from the extended role could let KVM reuse an MMU with the
wrong pkr_mask.
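For context, the consumption side looks roughly like the sketch below. This is
not the actual permission_fault() code; the helper name, mask layout, and types
are made up purely to illustrate why a stale per-MMU mask gives the wrong
answer:

/*
 * Illustrative sketch: a per-MMU protection-key mask, precomputed from CR4
 * state, is consulted on every PK-relevant access.  pkr_value stands for the
 * guest's PKRU (user pages) or IA32_PKRS (supervisor pages) contents.
 */
#include <stdbool.h>
#include <stdint.h>

static bool pkr_check_fault(uint32_t pkr_mask, uint32_t pkr_value,
			    unsigned int pkey, bool write, bool user)
{
	/* AD/WD bits for this protection key. */
	uint32_t pkr_bits = (pkr_value >> (pkey * 2)) & 3;
	/* 2-bit slot in the precomputed mask for this (user, write) case. */
	unsigned int slot = ((user ? 2u : 0u) | (write ? 1u : 0u)) * 2;

	pkr_bits &= pkr_mask >> slot;
	return pkr_bits != 0;
}

If the MMU being reused was built with CR4.PKS clear, its mask has nothing set
for the supervisor slots, and a check along these lines silently skips
#PF(PK) faults it should report.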
That being said, I don't think it needs a dedicated bit. IIUC, the logic is
PKU|PKS, with pkr_mask generation being PKU vs. PKS agnostic. The role could do
the same and smush the bits together, e.g.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3d6616f6f6ef..3bfca34f6ea2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -293,7 +293,7 @@ union kvm_mmu_extended_role {
 		unsigned int cr0_pg:1;
 		unsigned int cr4_pae:1;
 		unsigned int cr4_pse:1;
-		unsigned int cr4_pke:1;
+		unsigned int cr4_pkr:1;
 		unsigned int cr4_smap:1;
 		unsigned int cr4_smep:1;
 		unsigned int maxphyaddr:6;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 79166288ed03..2774b85a36d5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4448,7 +4448,8 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu)
 	ext.cr4_smep = !!kvm_read_cr4_bits(vcpu, X86_CR4_SMEP);
 	ext.cr4_smap = !!kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
 	ext.cr4_pse = !!is_pse(vcpu);
-	ext.cr4_pke = !!kvm_read_cr4_bits(vcpu, X86_CR4_PKE);
+	ext.cr4_pkr = !!kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
+		      !!kvm_read_cr4_bits(vcpu, X86_CR4_PKS);
 	ext.maxphyaddr = cpuid_maxphyaddr(vcpu);
 
 	ext.valid = 1;
Another option would be to move the tracking out of the MMU, e.g. make pkr_mask
per-vCPU and recalculate when CR4 changes. I think that would "just work", even
when nested VMs are in play?
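Very roughly, and hand-waving the plumbing (vcpu->arch.pkr_mask and
calc_pkr_mask() below are placeholders for whatever the series would actually
add, and hooking kvm_set_cr4() is just one option):

/*
 * Sketch: keep the mask on the vCPU and refresh it whenever CR4.PKE or
 * CR4.PKS changes, so MMU reuse can never hand back a stale mask.
 */
static void kvm_update_pkr_mask(struct kvm_vcpu *vcpu)
{
	if (kvm_read_cr4_bits(vcpu, X86_CR4_PKE | X86_CR4_PKS))
		vcpu->arch.pkr_mask = calc_pkr_mask(vcpu);
	else
		vcpu->arch.pkr_mask = 0;
}

permission_fault() would then pull the mask from vcpu->arch instead of from
the mmu, and the extended role wouldn't need to track PKE/PKS at all.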