Message-ID: <871434fe-ae80-bec6-9920-a6411f5842c0@gmail.com>
Date: Tue, 28 Mar 2023 17:16:08 +0800
From: Like Xu <like.xu.linux@...il.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Sean Christopherson <seanjc@...gle.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] KVM: x86/pmu: Fix emulation on Intel counters' bit
width
On 27/3/2023 10:30 pm, Paolo Bonzini wrote:
> On Wed, Mar 22, 2023 at 10:31 AM Like Xu <like.xu.linux@...il.com> wrote:
>>
>> From: Like Xu <likexu@...cent.com>
>>
>> Per Intel SDM, the bit width of a PMU counter is specified via CPUID
>> only if the vCPU has FW_WRITE[bit 13] on IA32_PERF_CAPABILITIES.
>> When the FW_WRITE bit is not set, only EAX is valid and out-of-bounds
>> bits accesses do not generate #GP. Conversely when this bit is set, #GP
>> for out-of-bounds bits accesses will also appear on the fixed counters.
>> vPMU currently does not support emulation of bit widths lower than 32
>> bits or higher than its host capability.
>
> Can you please point out the date and paragraph of the SDM?
>
> Paolo
>
25462-078US, December 2022
20.2.6 Full-Width Writes to Performance Counter Registers
The general-purpose performance counter registers IA32_PMCx are writable via
WRMSR instruction. However, the value written into IA32_PMCx by WRMSR is the
signed extended 64-bit value of the EAX[31:0] input of WRMSR.

A processor that supports full-width writes to the general-purpose performance
counters enumerated by CPUID.0AH:EAX[15:8] will set IA32_PERF_CAPABILITIES[13]
to enumerate its full-width-write capability. See Figure 20-65.

If IA32_PERF_CAPABILITIES.FW_WRITE[bit 13] = 1, each IA32_PMCi is accompanied
by a corresponding alias address starting at 4C1H for IA32_A_PMC0.

The bit width of the performance monitoring counters is specified in
CPUID.0AH:EAX[23:16].

If IA32_A_PMCi is present, the 64-bit input value (EDX:EAX) of WRMSR to
IA32_A_PMCi will cause IA32_PMCi to be updated by:

    COUNTERWIDTH = CPUID.0AH:EAX[23:16] bit width of the performance monitoring counter
    IA32_PMCi[COUNTERWIDTH-1:32] := EDX[COUNTERWIDTH-33:0];
    IA32_PMCi[31:0] := EAX[31:0];
    EDX[63:COUNTERWIDTH] are reserved
---
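To make the quoted update rule concrete, here is a rough C sketch of the two
write paths. It is not KVM code: the function names are made up for
illustration, the truncation of the legacy write to the counter width is my
assumption, and treating a set reserved bit as a fault (#GP) is my reading of
"are reserved" rather than something the quoted text spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask of the low `width` bits, where width models CPUID.0AH:EAX[23:16]. */
static uint64_t counter_mask(unsigned int width)
{
	assert(width >= 1 && width <= 64);
	return width == 64 ? ~0ULL : (1ULL << width) - 1;
}

/*
 * Legacy WRMSR to IA32_PMCi: only EAX[31:0] is consumed and it is
 * sign-extended to 64 bits. Truncating the result to the counter width
 * is an assumption of this sketch.
 */
static uint64_t pmc_write_legacy(uint32_t eax, unsigned int width)
{
	uint64_t val = (uint64_t)(int64_t)(int32_t)eax;

	return val & counter_mask(width);
}

/*
 * Full-width WRMSR to the IA32_A_PMCi alias: EDX:EAX is taken as-is.
 * Returns false if any bit at or above `width` is set (hypothetical
 * stand-in for injecting #GP); otherwise stores the value in *pmc.
 */
static bool pmc_write_full_width(uint32_t eax, uint32_t edx,
				 unsigned int width, uint64_t *pmc)
{
	uint64_t val = ((uint64_t)edx << 32) | eax;

	if (val & ~counter_mask(width))
		return false;

	*pmc = val;
	return true;
}
```

With a 48-bit counter, a legacy write of EAX = 0x80000000 smears the sign bit
up to bit 47, while a full-width write with EDX = 0x10000 touches bit 48 and
is rejected.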
Some might argue that this section only talks about GP counters, not fixed
counters. In practice, the full-width write hardware behaviour is presumed to
be the same for all counters.

Commercial hardware will not use a width below 32 bits, nor an odd width such
as 46 bits. KVM user space (such as selftests), however, may set a strange bit
width, for example 33 bits, and with the current code, writing the reserved
bits of the fixed counters does not cause #GP.

Also, when the guest does not have the full-width feature, CPUID can make the
fixed counters wider than 32 bits while the GP counters are only 32 bits wide,
which is equally odd.

The current KVM is also not capable of emulating counter overflow when KVM
user space sets a bit width of less than 32 bits with FW_WRITE.

This SDM-undefined behaviour is what led to this fix, which may lift some of
the fog.
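For illustration only, the two rules from the commit message could be
sketched in C as below. This is not the patch itself: the helper names and
signatures are hypothetical, and the real KVM code is structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask of the low `width` bits; callers pass a width between 1 and 64. */
static uint64_t width_mask(unsigned int width)
{
	return width >= 64 ? ~0ULL : (1ULL << width) - 1;
}

/*
 * Width sanity check (hypothetical helper): the commit message says the
 * vPMU cannot emulate widths below 32 bits or above the host capability.
 */
static bool guest_counter_width_ok(unsigned int guest_width,
				   unsigned int host_width)
{
	return guest_width >= 32 && guest_width <= host_width;
}

/*
 * Should a counter write fault (#GP)? Per the commit message, bits beyond
 * the counter width fault only when the guest has FW_WRITE; without it,
 * only EAX is valid and the out-of-bounds bits do not generate #GP.
 */
static bool counter_write_faults(uint64_t data, unsigned int width,
				 bool fw_write)
{
	if (!fw_write)
		return false;

	return (data & ~width_mask(width)) != 0;
}
```

Under these assumptions, a guest width of 24 on a 48-bit host is rejected up
front, and a write with bit 48 set faults only when FW_WRITE is exposed to
the guest.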