Message-ID: <bc7aea45-f254-4cbc-8dc0-5435417d8577@intel.com>
Date: Thu, 26 Jun 2025 16:59:00 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Sean Christopherson <seanjc@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>, Marc Zyngier <maz@...nel.org>,
Oliver Upton <oliver.upton@...ux.dev>
Cc: kvm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
kvmarm@...ts.linux.dev, linux-kernel@...r.kernel.org,
Jim Mattson <jmattson@...gle.com>
Subject: Re: [PATCH v5 2/5] KVM: x86: Provide a capability to disable
APERF/MPERF read intercepts
On 6/26/2025 8:12 AM, Sean Christopherson wrote:
> From: Jim Mattson <jmattson@...gle.com>
>
> Allow a guest to read the physical IA32_APERF and IA32_MPERF MSRs
> without interception.
>
> The IA32_APERF and IA32_MPERF MSRs are not virtualized. Writes are not
> handled at all. The MSR values are not zeroed on vCPU creation, saved
> on suspend, or restored on resume. No accommodation is made for
> processor migration or for sharing a logical processor with other
> tasks. No adjustments are made for non-unit TSC multipliers. The MSRs
> do not account for time the same way as the comparable PMU events,
> whether the PMU is virtualized by the traditional emulation method or
> the new mediated pass-through approach.
>
> Nonetheless, in a properly constrained environment, this capability
> can be combined with a guest CPUID table that advertises support for
> CPUID.6:ECX.APERFMPERF[bit 0] to induce a Linux guest to report the
> effective physical CPU frequency in /proc/cpuinfo. Moreover, there is
> no performance cost for this capability.
>
> Signed-off-by: Jim Mattson <jmattson@...gle.com>
> Link: https://lore.kernel.org/r/20250530185239.2335185-3-jmattson@google.com
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
> Documentation/virt/kvm/api.rst | 23 +++++++++++++++++++++++
> arch/x86/kvm/svm/nested.c | 4 +++-
> arch/x86/kvm/svm/svm.c | 5 +++++
> arch/x86/kvm/vmx/nested.c | 6 ++++++
> arch/x86/kvm/vmx/vmx.c | 4 ++++
> arch/x86/kvm/x86.c | 6 +++++-
> arch/x86/kvm/x86.h | 5 +++++
> include/uapi/linux/kvm.h | 1 +
> tools/include/uapi/linux/kvm.h | 1 +
> 9 files changed, 53 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 43ed57e048a8..27ced3ee2b53 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7844,6 +7844,7 @@ Valid bits in args[0] are::
> #define KVM_X86_DISABLE_EXITS_HLT (1 << 1)
> #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2)
> #define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3)
> + #define KVM_X86_DISABLE_EXITS_APERFMPERF (1 << 4)
>
> Enabling this capability on a VM provides userspace with a way to no
> longer intercept some instructions for improved latency in some
> @@ -7854,6 +7855,28 @@ all such vmexits.
>
> Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
>
> +Virtualizing the ``IA32_APERF`` and ``IA32_MPERF`` MSRs requires more
> +than just disabling APERF/MPERF exits. While both Intel and AMD
> +document strict usage conditions for these MSRs--emphasizing that only
> +the ratio of their deltas over a time interval (T0 to T1) is
> +architecturally defined--simply passing through the MSRs can still
> +produce an incorrect ratio.
> +
> +This erroneous ratio can occur if, between T0 and T1:
> +
> +1. The vCPU thread migrates between logical processors.
> +2. Live migration or suspend/resume operations take place.
> +3. Another task shares the vCPU's logical processor.
> +4. C-states lower thean C0 are emulated (e.g., via HLT interception).
s/thean/than/
Reviewed-by: Xiaoyao Li <xiaoyao.li@...el.com>
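
For readers following the documentation hunk above, here is a minimal
sketch of the delta-ratio computation it refers to. It is not code from
the patch. The struct, the function name and the base_khz parameter are
hypothetical, and the sketch assumes both samples were taken on the same
logical processor with nothing else running on it in between, which is
exactly what conditions 1-4 violate.

```c
#include <stdint.h>

/* Hypothetical helper type, not from the patch: one reading of both MSRs. */
struct aperfmperf_sample {
	uint64_t aperf;	/* IA32_APERF value at sample time */
	uint64_t mperf;	/* IA32_MPERF value at sample time */
};

/*
 * Estimate the effective frequency in kHz over the interval [T0, T1].
 *
 * Only the ratio of the two deltas is architecturally meaningful, so the
 * absolute counter values are never used on their own. base_khz is the
 * frequency at which MPERF counts and is assumed to be known by the caller.
 *
 * Returns 0 if MPERF did not advance. The multiplication can overflow for
 * very large deltas; this sketch assumes short sampling intervals.
 */
static uint64_t effective_khz(struct aperfmperf_sample t0,
			      struct aperfmperf_sample t1,
			      uint64_t base_khz)
{
	/* Unsigned subtraction stays correct across a counter wrap. */
	uint64_t d_aperf = t1.aperf - t0.aperf;
	uint64_t d_mperf = t1.mperf - t0.mperf;

	if (d_mperf == 0)
		return 0;

	return base_khz * d_aperf / d_mperf;
}
```

If the vCPU thread moves to another logical processor between the two
samples, t0 and t1 come from different physical counters, so the deltas
and therefore the ratio no longer describe any single processor.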