[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b7d8ea1b-7c43-4807-b589-092272c5679a@gmail.com>
Date: Wed, 6 Mar 2024 11:23:05 +0800
From: Like Xu <like.xu.linux@...il.com>
To: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>,
Sean Christopherson <seanjc@...gle.com>
Cc: Sandipan Das <sandipan.das@....com>, pbonzini@...hat.com,
mizhang@...gle.com, jmattson@...gle.com, ravi.bangoria@....com,
nikunj.dadhania@....com, santosh.shukla@....com, manali.shukla@....com,
babu.moger@....com, kvm list <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] KVM: x86/svm/pmu: Set PerfMonV2 global control bits
correctly
On 6/3/2024 10:32 am, Mi, Dapeng wrote:
>
> On 3/6/2024 2:04 AM, Sean Christopherson wrote:
>> On Tue, Mar 05, 2024, Sean Christopherson wrote:
>>> On Tue, Mar 05, 2024, Like Xu wrote:
>>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>>> index 87cc6c8809ad..f61ce26aeb90 100644
>>> --- a/arch/x86/kvm/pmu.c
>>> +++ b/arch/x86/kvm/pmu.c
>>> @@ -741,6 +741,8 @@ static void kvm_pmu_reset(struct kvm_vcpu *vcpu)
>>> */
>>> void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
>>> {
>>> + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>>> +
>>> if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
>>> return;
>>> @@ -750,8 +752,18 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
>>> */
>>> kvm_pmu_reset(vcpu);
>>> - bitmap_zero(vcpu_to_pmu(vcpu)->all_valid_pmc_idx, X86_PMC_IDX_MAX);
>>> + bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
>>> static_call(kvm_x86_pmu_refresh)(vcpu);
>>> +
>>> + /*
>>> + * At RESET, both Intel and AMD CPUs set all enable bits for general
>>> + * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
>>> + * was written for v1 PMUs don't unknowingly leave GP counters disabled
>>> + * in the global controls). Emulate that behavior when refreshing the
>>> + * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
>>> + */
>>> + if (kvm_pmu_has_perf_global_ctrl(pmu))
>>> + pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
>>> }
>> Doh, this is based on kvm/kvm-uapi, I'll rebase to kvm-x86/next before posting.
>>
>> I'll also update the changelog to call out that KVM has always clobbered
>> global_ctrl
>> during PMU refresh, i.e. there is no danger of breaking existing setups by
>> clobbering a value set by userspace, e.g. during live migration.
>>
>> Lastly, I'll also update the changelog to call out that KVM *did* actually set
>> the general purpose counter enable bits in global_ctrl at "RESET" until v6.0,
>> and that KVM intentionally removed that behavior because of what appears to be
>> an Intel SDM bug.
>>
>> Of course, in typical KVM fashion, that old code was also broken in its own way
>> (the history of this code is a comedy of errors). Initial vPMU support in commit
>> f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests") *almost*
>> got it right, but for some reason only set the bits if the guest PMU was
>> advertised as v1:
>>
>> if (pmu->version == 1) {
>> pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
>> return;
>> }
>>
>>
>> Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be enabled on
>> reset") then tried to remedy that goof, but botched things and also enabled the
>> fixed counters:
>>
>> pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
>> (((1ull << pmu->nr_arch_fixed_counters) - 1) <<
>> X86_PMC_IDX_FIXED);
>> pmu->global_ctrl_mask = ~pmu->global_ctrl;
>>
>> Which was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu: Don't
>> overwrite
>> the pmu->global_ctrl when refreshing") incorrectly removed *everything*. Very
>> ironically, that commit came from Like.
>>
>> Author: Like Xu <likexu@...cent.com>
>> Date: Tue May 10 12:44:07 2022 +0800
>>
>> KVM: x86/pmu: Don't overwrite the pmu->global_ctrl when refreshing
>> Assigning a value to pmu->global_ctrl just to set the value of
>> pmu->global_ctrl_mask is more readable but does not conform to the
>> specification. The value is reset to zero on Power up and Reset but
>> stays unchanged on INIT, like most other MSRs.
>>
>> But wait, it gets even better. Like wasn't making up that behavior, Intel's SDM
>> circa December 2022 states that "Global Perf Counter Controls" is '0' at Power-Up
>> and RESET. But then the March 2023 SDM rolls out and says
>>
>> IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
>>
>> So presumably someone at Intel noticed that what their CPUs do and what the
>> documentation says didn't match.
This reminds me quite a bit of the past, where this kind of thing happened
occasionally (e.g. some pmu events don't count as expected, and ucode
updates often sneak in to change hardware behaviour). Sometimes we have
to rely on hardware behaviour, sometimes we have to trust the documentation
specification, my experience has been to find a balance that is more
favourable to the end-user and to force the hardware vendors to make
more careful documentation and implementation. :D
>
> It's a really long story... thanks for figuring it out.
>
>>
>> *sigh* >>
>
Powered by blists - more mailing lists