linux-kernel - Re: [PATCH 1/3] KVM: x86: Refresh PMU after writes to MSR_IA32_PERF


Message-ID: <5090d500-1549-79ba-53a9-4929114eb569@gmail.com>
Date:   Fri, 29 Jul 2022 17:33:47 +0800
From:   Like Xu <like.xu.linux@...il.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] KVM: x86: Refresh PMU after writes to
 MSR_IA32_PERF_CAPABILITIES

On 28/7/2022 11:27 pm, Sean Christopherson wrote:
> On Thu, Jul 28, 2022, Like Xu wrote:
>> On 28/7/2022 7:34 am, Sean Christopherson wrote:
>>> Refresh the PMU if userspace modifies MSR_IA32_PERF_CAPABILITIES.  KVM
>>> consumes the vCPU's PERF_CAPABILITIES when enumerating PEBS support, but
>>> relies on CPUID updates to refresh the PMU.  I.e. KVM will do the wrong
>>> thing if userspace stuffs PERF_CAPABILITIES _after_ setting guest CPUID.
>>
>> Unwise userspace should reap its consequences if it does not break KVM or host.
> 
> I don't think this is a case of userspace being weird or unwise.  IMO, setting
> CPUID before MSRs is perfectly logical and intuitive.

The concern is whether user space should be allowed to change a feature MSR
value (as an alternative to CPUID or KVM_CAP) after the guest CPUID is
finalized or the guest has run for a while.

Changing whether a related CPUID feature is present by writing a feature MSR
afterwards, or vice versa, is ill-advised on the part of user space. Given such
ill-advised input, strange behaviour from KVM is to be expected, right?

A wise user space should keep the PEBS CPUID bits and the PEBS fields in
PERF_CAPABILITIES consistent, in whatever order they are passed to KVM.
The KVM implementation should treat them as equivalent for any availability
check (performance aside, it's my bad to traverse CPUID rather than perf_cap).

If two or more settings cannot be coordinated with each other at the user-space
level, KVM must choose to rely on one setting or another, or check all settings
(more expensive).
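
To make the ordering problem concrete, here is a minimal toy model in plain
userspace C. It is only a sketch under stated assumptions: every toy_* name and
the TOY_PEBS_FORMAT_MASK value are hypothetical stand-ins, not KVM symbols, and
the "refresh" here just recomputes one cached flag from two inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every toy_* name is hypothetical, not a KVM symbol. */
#define TOY_PEBS_FORMAT_MASK 0xf00ULL	/* assumed stand-in for a PEBS field */

struct toy_vcpu {
	bool cpuid_has_pebs;		/* input set through the "CPUID" path */
	uint64_t perf_capabilities;	/* input set through the "MSR" path */
	bool pebs_enabled;		/* cached state derived from both inputs */
};

/* Recompute the cached state from both inputs. */
static void toy_pmu_refresh(struct toy_vcpu *v)
{
	v->pebs_enabled = v->cpuid_has_pebs &&
			  (v->perf_capabilities & TOY_PEBS_FORMAT_MASK) != 0;
}

/* The "CPUID" path always refreshes the cached state. */
static void toy_set_cpuid(struct toy_vcpu *v, bool has_pebs)
{
	v->cpuid_has_pebs = has_pebs;
	toy_pmu_refresh(v);
}

/* The "MSR" path refreshes only if asked to, to compare both behaviours. */
static void toy_set_perf_cap(struct toy_vcpu *v, uint64_t data,
			     bool refresh_on_write)
{
	v->perf_capabilities = data;
	if (refresh_on_write)
		toy_pmu_refresh(v);
}

/* Cached state after "CPUID first, MSR second". */
static bool toy_cpuid_then_msr(bool refresh_on_write)
{
	struct toy_vcpu v = { 0 };

	toy_set_cpuid(&v, true);
	toy_set_perf_cap(&v, 0x400, refresh_on_write);
	return v.pebs_enabled;
}

/* Cached state after "MSR first, CPUID second". */
static bool toy_msr_then_cpuid(void)
{
	struct toy_vcpu v = { 0 };

	toy_set_perf_cap(&v, 0x400, false);
	toy_set_cpuid(&v, true);
	return v.pebs_enabled;
}

/* Runs both orderings and checks when the cached state goes stale. */
void toy_selfcheck(void)
{
	assert(toy_msr_then_cpuid());		/* CPUID path refreshes: fine */
	assert(!toy_cpuid_then_msr(false));	/* no refresh on write: stale */
	assert(toy_cpuid_then_msr(true));	/* refresh on write: consistent */
}
```

In this model the cached flag is only stale when the MSR is written last and
the write does not refresh, which is the case the patch is about. Deriving the
flag from both inputs at refresh time is the "check all settings" option.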

> 
>> When a guest feature can be defined/controlled by multiple KVM APIs entries,
>> (such as SET_CPUID2, msr_feature, KVM_CAP, module_para), should KVM
>> define the priority of these APIs (e.g. whether they can override each other) ?
> 
> KVM does have "rules" in the sense that it has an established ABI for things
> like KVM_CAP and module params, though documentation may be lacking in some cases.
> The CPUID and MSR ioctls don't have a prescribed ordering though.

Should we continue with this inter-dependence (as a silent feature)?
The patch implies that it should be left as it is in order not to break any
user space.

How do we break out of this rut?

> 
>> Removing this ambiguity ensures consistency in the architecture and behavior
>> of all KVM features.
> 
> Agreed, but the CPUID and MSR ioctls (among many others) have existed for quite
> some time.  KVM likely can't retroactively force a specific order without breaking
> one userspace or another.
> 
>> Any further performance optimizations can be based on these finalized values
>> as you do.
>>
>>>
>>> Opportunistically fix a curly-brace indentation.
>>>
>>> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
>>> Cc: Like Xu <like.xu.linux@...il.com>
>>> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
>>> ---
>>>    arch/x86/kvm/x86.c | 4 ++--
>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 5366f884e9a7..362c538285db 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -3543,9 +3543,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>    			return 1;
>>>    		vcpu->arch.perf_capabilities = data;
>>> -
>>> +		kvm_pmu_refresh(vcpu);
>>
>> I had proposed this diff but was met with silence.
> 
> My apologies, I either missed it or didn't connect the dots.