[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20200710154811.418214-10-mgamal@redhat.com>
Date: Fri, 10 Jul 2020 17:48:11 +0200
From: Mohammed Gamal <mgamal@...hat.com>
To: kvm@...r.kernel.org, pbonzini@...hat.com
Cc: linux-kernel@...r.kernel.org, vkuznets@...hat.com,
sean.j.christopherson@...el.com, wanpengli@...cent.com,
jmattson@...gle.com, joro@...tes.org,
Mohammed Gamal <mgamal@...hat.com>,
Tom Lendacky <thomas.lendacky@....com>,
Babu Moger <babu.moger@....com>
Subject: [PATCH v3 9/9] KVM: x86: SVM: VMX: Make GUEST_MAXPHYADDR < HOST_MAXPHYADDR support configurable
The reason behind including this patch is unexpected behaviour we see
with NPT vmexit handling in AMD processor.
With previous patch ("KVM: SVM: Add guest physical address check in
NPF/PF interception") we see the followning error multiple times in
the 'access' test in kvm-unit-tests:
test pte.p pte.36 pde.p: FAIL: pte 2000021 expected 2000001
Dump mapping: address: 0x123400000000
------L4: 24c3027
------L3: 24c4027
------L2: 24c5021
------L1: 1002000021
This shows that the PTE's accessed bit is apparently being set by
the CPU hardware before the NPF vmexit. This completely handled by
hardware and can not be fixed in software.
This patch introduces a workaround. We add a boolean variable:
'allow_smaller_maxphyaddr'
Which is set individually by VMX and SVM init routines. On VMX it's
always set to true, on SVM it's only set to true when NPT is not
enabled.
We also add a new capability KVM_CAP_SMALLER_MAXPHYADDR which
allows userspace to query if the underlying architecture would
support GUEST_MAXPHYADDR < HOST_MAXPHYADDR and hence act accordingly
(e.g. qemu can decide if it would ignore the -cpu ..,phys-bits=X)
CC: Tom Lendacky <thomas.lendacky@....com>
CC: Babu Moger <babu.moger@....com>
Signed-off-by: Mohammed Gamal <mgamal@...hat.com>
---
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/svm/svm.c | 15 +++++++++++++++
arch/x86/kvm/vmx/vmx.c | 7 +++++++
arch/x86/kvm/x86.c | 6 ++++++
include/uapi/linux/kvm.h | 1 +
5 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index bb4044ffb7b7..26002e1b47f7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1304,7 +1304,7 @@ struct kvm_arch_async_pf {
};
extern u64 __read_mostly host_efer;
-
+extern bool __read_mostly allow_smaller_maxphyaddr;
extern struct kvm_x86_ops kvm_x86_ops;
#define __KVM_HAVE_ARCH_VM_ALLOC
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 79c33b3539f0..f3d7ae26875c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -924,6 +924,21 @@ static __init int svm_hardware_setup(void)
svm_set_cpu_caps();
+ /*
+ * It seems that on AMD processors PTE's accessed bit is
+ * being set by the CPU hardware before the NPF vmexit.
+ * This is not expected behaviour and our tests fail because
+ * of it.
+ * A workaround here is to disable support for
+ * GUEST_MAXPHYADDR < HOST_MAXPHYADDR if NPT is enabled.
+ * In this case userspace can know if there is support using
+ * KVM_CAP_SMALLER_MAXPHYADDR extension and decide how to handle
+ * it
+ * If future AMD CPU models change the behaviour described above,
+ * this variable can be changed accordingly
+ */
+ allow_smaller_maxphyaddr = !npt_enabled;
+
return 0;
err:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0cebc4832805..8a8e85e6c529 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8294,6 +8294,13 @@ static int __init vmx_init(void)
#endif
vmx_check_vmcs12_offsets();
+ /*
+ * Intel processors don't have problems with
+ * GUEST_MAXPHYADDR < HOST_MAXPHYADDR so enable
+ * it for VMX by default
+ */
+ allow_smaller_maxphyaddr = true;
+
return 0;
}
module_init(vmx_init);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 03c401963062..167becd6a634 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -187,6 +187,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
u64 __read_mostly host_efer;
EXPORT_SYMBOL_GPL(host_efer);
+bool __read_mostly allow_smaller_maxphyaddr;
+EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr);
+
static u64 __read_mostly host_xss;
u64 __read_mostly supported_xss;
EXPORT_SYMBOL_GPL(supported_xss);
@@ -3538,6 +3541,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
r = kvm_x86_ops.nested_ops->enable_evmcs != NULL;
break;
+ case KVM_CAP_SMALLER_MAXPHYADDR:
+ r = (int) allow_smaller_maxphyaddr;
+ break;
default:
break;
}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4fdf30316582..68cd3a0af9bb 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1031,6 +1031,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_PPC_SECURE_GUEST 181
#define KVM_CAP_HALT_POLL 182
#define KVM_CAP_ASYNC_PF_INT 183
+#define KVM_CAP_SMALLER_MAXPHYADDR 184
#ifdef KVM_CAP_IRQ_ROUTING
--
2.26.2
Powered by blists - more mailing lists