Message-ID: <5be999eb-64d7-de0e-254b-82711acc5e24@intel.com>
Date: Fri, 5 Mar 2021 10:33:24 +0800
From: "Xu, Like" <like.xu@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Kan Liang <kan.liang@...ux.intel.com>,
Dave Hansen <dave.hansen@...el.com>, wei.w.wang@...el.com,
Borislav Petkov <bp@...en8.de>, kvm@...r.kernel.org,
x86@...nel.org, linux-kernel@...r.kernel.org,
Like Xu <like.xu@...ux.intel.com>
Subject: Re: [PATCH v3 5/9] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for
Arch LBR
On 2021/3/5 0:12, Sean Christopherson wrote:
> On Thu, Mar 04, 2021, Xu, Like wrote:
>> Hi Sean,
>>
>> Thanks for your detailed review on the patch set.
>>
>> On 2021/3/4 0:58, Sean Christopherson wrote:
>>> On Wed, Mar 03, 2021, Like Xu wrote:
>>>> @@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
>>>> return true;
>>>> }
>>>> +/*
>>>> + * Check if the requested depth value is supported
>>>> + * based on bits [0:7] of the guest cpuid.1c.eax.
>>>> + */
>>>> +static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
>>>> +{
>>>> + struct kvm_cpuid_entry2 *best;
>>>> +
>>>> + best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
>>>> + if (best && depth && !(depth % 8))
>>> This is still wrong, it fails to weed out depth > 64.
>> How come ? The testcases depth = {65, 127, 128} get #GP as expected.
> @depth is a u64, throw in a number that is a multiple of 8 and >= 520, and the
> "(1ULL << (depth / 8 - 1))" will trigger undefined behavior due to shifting
> beyond the capacity of a ULL / u64.
One more question:
when we say "undefined behavior" for a shift beyond the width of a ULL,
do you mean that the actual behavior depends on the machine, the
architecture, or the compiler?
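For reference, a minimal user-space sketch (not the kernel code) of a
guard that keeps the shift fully defined; the helper name and values
here are made up for illustration:

```c
#include <stdint.h>

/*
 * Per C11 6.5.7p3, shifting a 64-bit value by 64 or more is undefined
 * behavior.  x86 hardware masks the runtime shift count to 6 bits,
 * other ISAs and compile-time constant folding may do something else
 * entirely, so the result really can differ by machine and compiler.
 * Bounding the count before shifting keeps the expression defined.
 */
static uint64_t bit_or_zero(unsigned int n)
{
	return n < 64 ? 1ULL << n : 0;
}
```

With the guard, bit_or_zero(3) is 8 and bit_or_zero(64) is simply 0,
with no undefined behavior on any platform.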
>
> Adding the "< 64" check would also allow dropping the " & 0xff" since the check
> would ensure the shift doesn't go beyond bit 7. I'm not sure the cleverness is
> worth shaving a cycle, though.
Finally, how about:

	if (best && depth && depth < 65 && !(depth & 7))
		return best->eax & BIT_ULL(depth / 8 - 1);
	return false;

Do you see any room for optimization?
>
>>> Not that this is a hot path, but it's probably worth double checking that the
>>> compiler generates simple code for "depth % 8", e.g. it can be "depth & 7)".
>> Emm, the "%" operation is quite normal over kernel code.
> So is "&" :-) I was just pointing out that the compiler should optimize this,
> and it did.
>
>> if (best && depth && !(depth % 8))
>> 10659: 48 85 c0 test rax,rax
>> 1065c: 74 c7 je 10625 <intel_pmu_set_msr+0x65>
>> 1065e: 4d 85 e4 test r12,r12
>> 10661: 74 c2 je 10625 <intel_pmu_set_msr+0x65>
>> 10663: 41 f6 c4 07 test r12b,0x7
>> 10667: 75 bc jne 10625 <intel_pmu_set_msr+0x65>
>>
>> It looks like the compiler does the right thing.
>> Do you see the room for optimization ?
>>
>>>> + return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));
> Actually, looking at this again, I would explicitly use BIT() instead of 1ULL
> (or BIT_ULL), since the shift must be 7 or less.
>
>>>> +
>>>> + return false;
>>>> +}
>>>> +
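Putting Sean's suggestions together, a user-space sketch of how the
final check might look (the BIT() macro and the CPUID.1CH:EAX value are
mocked here for illustration; the real code reads the leaf via
kvm_find_cpuid_entry(vcpu, 0x1c, 0)):

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1UL << (n))

/*
 * Mock of the guest's CPUID.1CH:EAX[7:0] supported-depth bitmap;
 * bit N set means an LBR depth of 8 * (N + 1) is supported.
 * 0x4 is a hypothetical value meaning only depth 24 is supported.
 */
static uint32_t cpuid_1c_eax = 0x4;

static bool arch_lbr_depth_is_valid(uint64_t depth)
{
	/* Reject 0, anything above 64, and non-multiples of 8. */
	if (!depth || depth > 64 || (depth & 7))
		return false;

	/* depth / 8 - 1 is at most 7 here, so plain BIT() suffices. */
	return cpuid_1c_eax & BIT(depth / 8 - 1);
}
```

The explicit upper bound both closes the undefined-behavior hole for
large multiples of 8 (e.g. 520) and makes the "& 0xff" mask redundant,
as discussed above.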