linux-kernel - Re: [PATCH 1/4] KVM: x86/pmu: Force reprogramming of all counters on PMU filter change

Message-ID: <8007659a-6c6c-2c5e-f500-652ed31448fb@gmail.com>
Date:   Fri, 14 Oct 2022 14:41:54 +0800
From:   Like Xu <like.xu.linux@...il.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Aaron Lewis <aaronlewis@...gle.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 1/4] KVM: x86/pmu: Force reprogramming of all counters on
 PMU filter change

On 14/10/2022 4:53 am, Sean Christopherson wrote:
> On Thu, Oct 13, 2022, Like Xu wrote:
>> Firstly, thanks for your comments that spewed out around vpmu.
>>
>> On 23/9/2022 8:13 am, Sean Christopherson wrote:
>>> Force vCPUs to reprogram all counters on a PMU filter change to provide
>>> a sane ABI for userspace.  Use the existing KVM_REQ_PMU to do the
>>> programming, and take advantage of the fact that the reprogram_pmi bitmap
>>> fits in a u64 to set all bits in a single atomic update.  Note, setting
>>> the bitmap and making the request needs to be done _after_ the SRCU
>>> synchronization to ensure that vCPUs will reprogram using the new filter.
>>>
>>> KVM's current "lazy" approach is confusing and non-deterministic.  It's
>>
>> The resolute lazy approach was introduced in patch 03, right after this change.
> 
> This is referring to the lazy recognition of the filter, not the deferred
> reprogramming of the counters.  Regardless of whether reprogramming is handled
> via request or in-line, KVM is still lazily recognizing the new filter as vCPUs
> won't pick up the new filter until the _guest_ triggers a refresh.

It may still be too late for the PMU filter to take effect. To eliminate this
non-determinism, should we kick all vPMU-enabled vCPUs right after making the
KVM_REQ_PMU requests?

> 
>>> @@ -613,9 +615,18 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
>>>    	mutex_lock(&kvm->lock);
>>>    	filter = rcu_replace_pointer(kvm->arch.pmu_event_filter, filter,
>>>    				     mutex_is_locked(&kvm->lock));
>>> -	mutex_unlock(&kvm->lock);
>>> -
>>>    	synchronize_srcu_expedited(&kvm->srcu);
>>
>> The relative order of these two operations, mutex_unlock() and
>> synchronize_srcu_expedited(), has been reversed, which extends the
>> critical section protected by kvm->lock.
>> The motivation is also not explicitly stated in the commit message.
> 
> I'll add a blurb, after I re-convince myself that the sync+request needs to be
> done under kvm->lock.
> 
>>> +	BUILD_BUG_ON(sizeof(((struct kvm_pmu *)0)->reprogram_pmi) >
>>> +		     sizeof(((struct kvm_pmu *)0)->__reprogram_pmi));
>>> +
>>> +	kvm_for_each_vcpu(i, vcpu, kvm)
>>> +		atomic64_set(&vcpu_to_pmu(vcpu)->__reprogram_pmi, -1ull);
>>
>> How about:
>> 	bitmap_copy(pmu->reprogram_pmi, pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
>> to avoid further cycles on calls of
>> "static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, bit)" ?
> 
> bitmap_copy() was my first choice too, but unfortunately it doesn't guarantee
> atomicity and could lead to data corruption if the target vCPU is concurrently
> modifying the bitmap.
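
To illustrate the atomicity point, here is a minimal userspace sketch using C11 atomics rather than the kernel's atomic64_t API. The names demo_pmu, demo_mark_all() and demo_take_pending() are hypothetical and do not exist in KVM; the sketch only assumes, as the patch does, that the whole bitmap fits in one 64-bit word.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the reprogram bitmap; not the kernel's struct kvm_pmu. */
struct demo_pmu {
	_Atomic uint64_t reprogram_bits;
};

/*
 * Writer side: mark every counter for reprogramming with a single atomic
 * 64-bit store, so a concurrent reader sees either the old or the new value,
 * never a partially written word.
 */
static void demo_mark_all(struct demo_pmu *pmu)
{
	atomic_store(&pmu->reprogram_bits, UINT64_MAX);
}

/*
 * Owner side: atomically fetch and clear the pending bits in one step, so
 * bits set concurrently by the writer are either returned now or left for
 * the next call, but never lost.
 */
static uint64_t demo_take_pending(struct demo_pmu *pmu)
{
	return atomic_exchange(&pmu->reprogram_bits, 0);
}
```

Calling demo_mark_all() followed by demo_take_pending() returns all 64 bits set, and a second demo_take_pending() returns 0. A plain word-by-word copy gives no such guarantee when another thread modifies the same word concurrently.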

Indeed, it may help to reuse "pmu->global_ctrl_mask" instead of "-1ull":

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 4504987cbbe2..8e279f816e27 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -621,7 +621,8 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
  		     sizeof(((struct kvm_pmu *)0)->__reprogram_pmi));

  	kvm_for_each_vcpu(i, vcpu, kvm)
-		atomic64_set(&vcpu_to_pmu(vcpu)->__reprogram_pmi, -1ull);
+		atomic64_set(&vcpu_to_pmu(vcpu)->__reprogram_pmi,
+			     pmu->global_ctrl_mask);

  	kvm_make_all_cpus_request(kvm, KVM_REQ_PMU);

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index b68956299fa8..a946c1c57e1d 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -185,6 +185,7 @@ static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
  	pmu->nr_arch_fixed_counters = 0;
  	pmu->global_status = 0;
  	bitmap_set(pmu->all_valid_pmc_idx, 0, pmu->nr_arch_gp_counters);
+	pmu->global_ctrl_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
  }

  static void amd_pmu_init(struct kvm_vcpu *vcpu)
-- 
2.38.0
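
As a worked example of the mask arithmetic in the amd_pmu_refresh() hunk, the following standalone C sketch evaluates the same formula in userspace. The helper names demo_reserved_mask() and demo_valid_counter_bits() are hypothetical and are not part of KVM. With the formula as written, the set bits are those above the general-purpose counters, so the bits that correspond to existing counters are the complement; which of the two the reprogram bitmap should receive is not settled in this message.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the formula in the diff:
 *   ~((1ull << nr_arch_gp_counters) - 1)
 * The result has a bit set for every index that is NOT a GP counter.
 */
static uint64_t demo_reserved_mask(unsigned int nr_gp_counters)
{
	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	if (nr_gp_counters >= 64)
		return 0;
	return ~((UINT64_C(1) << nr_gp_counters) - 1);
}

/* Complement of the above: one bit per existing GP counter. */
static uint64_t demo_valid_counter_bits(unsigned int nr_gp_counters)
{
	return ~demo_reserved_mask(nr_gp_counters);
}
```

For example, with 6 counters demo_reserved_mask(6) is 0xffffffffffffffc0 and demo_valid_counter_bits(6) is 0x3f.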