lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1d024d71-0b02-4481-a0d4-f1786313c1e7@linux.intel.com>
Date: Thu, 15 May 2025 10:53:39 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>,
 Mingwei Zhang <mizhang@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>,
 Mark Rutland <mark.rutland@....com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
 Adrian Hunter <adrian.hunter@...el.com>, Liang@...gle.com,
 Kan <kan.liang@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>,
 linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
 kvm@...r.kernel.org, linux-kselftest@...r.kernel.org,
 Yongwei Ma <yongwei.ma@...el.com>,
 Xiong Zhang <xiong.y.zhang@...ux.intel.com>,
 Jim Mattson <jmattson@...gle.com>, Sandipan Das <sandipan.das@....com>,
 Zide Chen <zide.chen@...el.com>, Eranian Stephane <eranian@...gle.com>,
 Shukla Manali <Manali.Shukla@....com>,
 Nikunj Dadhania <nikunj.dadhania@....com>
Subject: Re: [PATCH v4 14/38] KVM: x86/pmu: Introduce enable_mediated_pmu
 global parameter


On 5/15/2025 8:09 AM, Sean Christopherson wrote:
> On Mon, Mar 24, 2025, Mingwei Zhang wrote:
>> From: Dapeng Mi <dapeng1.mi@...ux.intel.com>
>>
>> Introduce enable_mediated_pmu global parameter to control if mediated
>> vPMU can be enabled on KVM level. Even enable_mediated_pmu is set to
>> true in KVM, user space hypervisor still need to enable mediated vPMU
>> explicitly by calling KVM_CAP_PMU_CAPABILITY ioctl. This gives
>> hypervisor flexibility to enable or disable mediated vPMU for each VM.
>>
>> Mediated vPMU depends on some PMU features on higher PMU version, like
>> PERF_GLOBAL_STATUS_SET MSR in v4+ for Intel PMU. Thus introduce a
>> pmu_ops variable MIN_MEDIATED_PMU_VERSION to indicates the minimum host
>> PMU version which mediated vPMU needs.
>>
>> Currently enable_mediated_pmu is not exposed to user space as a module
>> parameter until all mediated vPMU code are in place.
>>
>> Suggested-by: Sean Christopherson <seanjc@...gle.com>
>> Co-developed-by: Mingwei Zhang <mizhang@...gle.com>
>> Signed-off-by: Mingwei Zhang <mizhang@...gle.com>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
>> ---
>>  arch/x86/kvm/pmu.c              |  3 ++-
>>  arch/x86/kvm/pmu.h              | 11 +++++++++
>>  arch/x86/kvm/svm/pmu.c          |  1 +
>>  arch/x86/kvm/vmx/capabilities.h |  3 ++-
>>  arch/x86/kvm/vmx/pmu_intel.c    |  5 ++++
>>  arch/x86/kvm/vmx/vmx.c          |  3 ++-
>>  arch/x86/kvm/x86.c              | 44 ++++++++++++++++++++++++++++++---
>>  arch/x86/kvm/x86.h              |  1 +
>>  8 files changed, 64 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 75e9cfc689f8..4f455afe4009 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -775,7 +775,8 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
>>  	pmu->pebs_data_cfg_rsvd = ~0ull;
>>  	bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
>>  
>> -	if (!vcpu->kvm->arch.enable_pmu)
>> +	if (!vcpu->kvm->arch.enable_pmu ||
>> +	    (!lapic_in_kernel(vcpu) && enable_mediated_pmu))
> This check belongs in KVM_CAP_PMU_CAPABILITY, i.e. KVM needs to reject enabling
> a mediated PMU without an in-kernel local APIC, not silently drop the PMU.

Good idea.


>
>>  		return;
>>  
>>  	kvm_pmu_call(refresh)(vcpu);
>> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
>> index ad89d0bd6005..dd45a0c6be74 100644
>> --- a/arch/x86/kvm/pmu.h
>> +++ b/arch/x86/kvm/pmu.h
>> @@ -45,6 +45,7 @@ struct kvm_pmu_ops {
>>  	const u64 EVENTSEL_EVENT;
>>  	const int MAX_NR_GP_COUNTERS;
>>  	const int MIN_NR_GP_COUNTERS;
>> +	const int MIN_MEDIATED_PMU_VERSION;
> I like the idea, but simply checking the PMU version is insufficient on Intel,
> i.e. just add a callback.

sure.


>
>>  };
>>  
>>  void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
>> @@ -63,6 +64,12 @@ static inline bool kvm_pmu_has_perf_global_ctrl(struct kvm_pmu *pmu)
>>  	return pmu->version > 1;
>>  }
>>  
>> +static inline bool kvm_mediated_pmu_enabled(struct kvm_vcpu *vcpu)
> kvm_vcpu_has_mediated_pmu() to align with e.g. guest_cpu_cap_has(), and because
> kvm_mediated_pmu_enabled() sounds like a VM-scoped or module-scoped helper.

exactly.


>
>> +{
>> +	return vcpu->kvm->arch.enable_pmu &&
> This is superfluous, pmu->version should never be non-zero without the PMU being
> enabled at the VM level.

Strictly speaking, "arch.enable_pmu" and pmu->version doesn't indicates
fully same thing.  "arch.enable_pmu" indicates whether PMU function is
enabled in KVM, but the "pmu->version" comes from user space configuration.
In theory user space could configure a "0"  PMU version just like
pmu_counters_test does. Currently I'm not sure if the check for
"pmu->version" can be removed, let me have a double check.


>
>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>> index 77012b2eca0e..425e93d4b1c6 100644
>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> @@ -739,4 +739,9 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
>>  	.EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
>>  	.MAX_NR_GP_COUNTERS = KVM_MAX_NR_INTEL_GP_COUNTERS,
>>  	.MIN_NR_GP_COUNTERS = 1,
>> +	/*
>> +	 * Intel mediated vPMU support depends on
>> +	 * MSR_CORE_PERF_GLOBAL_STATUS_SET which is supported from 4+.
>> +	 */
>> +	.MIN_MEDIATED_PMU_VERSION = 4,
>>  };
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 00ac94535c21..a4b5b6455c7b 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -7916,7 +7916,8 @@ static __init u64 vmx_get_perf_capabilities(void)
>>  	if (boot_cpu_has(X86_FEATURE_PDCM))
>>  		rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
>>  
>> -	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR)) {
>> +	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR) &&
>> +	    !enable_mediated_pmu) {
>>  		x86_perf_get_lbr(&vmx_lbr_caps);
>>  
>>  		/*
> There's a bit too much going on in this patch.  I think it makes sense to split
> the vendor chunks out to separate patches, so that each can elaborate on the
> exact requirements.

Sure.


>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 72995952978a..1ebe169b88b6 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -188,6 +188,14 @@ bool __read_mostly enable_pmu = true;
>>  EXPORT_SYMBOL_GPL(enable_pmu);
>>  module_param(enable_pmu, bool, 0444);
>>  
>> +/*
>> + * Enable/disable mediated passthrough PMU virtualization.
>> + * Don't expose it to userspace as a module paramerter until
>> + * all mediated vPMU code is in place.
>> + */
> No need for the comment, documenting this in the changelog is sufficient.

Sure.


>
>> +bool __read_mostly enable_mediated_pmu;
>> +EXPORT_SYMBOL_GPL(enable_mediated_pmu);
>> +
>>  bool __read_mostly eager_page_split = true;
>>  module_param(eager_page_split, bool, 0644);
>>  
>> @@ -6643,9 +6651,28 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>  			break;
>>  
>>  		mutex_lock(&kvm->lock);
>> -		if (!kvm->created_vcpus) {
>> -			kvm->arch.enable_pmu = !(cap->args[0] & KVM_PMU_CAP_DISABLE);
>> -			r = 0;
>> +		/*
>> +		 * To keep PMU configuration "simple", setting vPMU support is
>> +		 * disallowed if vCPUs are created, or if mediated PMU support
>> +		 * was already enabled for the VM.
>> +		 */
>> +		if (!kvm->created_vcpus &&
>> +		    (!enable_mediated_pmu || !kvm->arch.enable_pmu)) {
>> +			bool pmu_enable = !(cap->args[0] & KVM_PMU_CAP_DISABLE);
>> +
>> +			if (enable_mediated_pmu && pmu_enable) {
> Local APIC check goes here.

Yes.


>
>> +				char *err_msg = "Fail to enable mediated vPMU, " \
>> +					"please disable system wide perf events or nmi_watchdog " \
>> +					"(echo 0 > /proc/sys/kernel/nmi_watchdog).\n";
>> +
>> +				r = perf_get_mediated_pmu();
>> +				if (r)
>> +					kvm_err("%s", err_msg);
>
> #define MEDIATED_PMU_MSG "Fail to enable mediated vPMU, disable system wide perf events and nmi_watchdog.\n"

Sure.


>
> 				r = perf_create_mediated_pmu();
> 				if (r)
> 					kvm_err(MEDIATED_PMU_MSG);
>
>> +			} else
>> +				r = 0;
>> +
>> +			if (!r)
>> +				kvm->arch.enable_pmu = pmu_enable;
>>  		}
>>  		mutex_unlock(&kvm->lock);
>>  		break;
>> @@ -12723,7 +12750,14 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>  	kvm->arch.default_tsc_khz = max_tsc_khz ? : tsc_khz;
>>  	kvm->arch.apic_bus_cycle_ns = APIC_BUS_CYCLE_NS_DEFAULT;
>>  	kvm->arch.guest_can_read_msr_platform_info = true;
>> -	kvm->arch.enable_pmu = enable_pmu;
>> +
>> +	/*
>> +	 * PMU virtualization is opt-in when mediated PMU support is enabled.
>> +	 * KVM_CAP_PMU_CAPABILITY ioctl must be called explicitly to enable
>> +	 * mediated vPMU. For legacy perf-based vPMU, its behavior isn't changed,
>> +	 * KVM_CAP_PMU_CAPABILITY ioctl is optional.
>> +	 */
> Again, too much extraneous info, the exception proves the rule.  I.e. by calling
> out that mediated PMU is special, it's clear the rule is that PMUs are enabled by
> default in the !mediated case.

Sure.


>
> 	/*
> 	 * Userspace must explicitly opt-in to PMU virtualization when mediated
> 	 * PMU support is enabled (see KVM_CAP_PMU_CAPABILITY).
> 	 */
>
>> +	kvm->arch.enable_pmu = enable_pmu && !enable_mediated_pmu;
> So I tried to run a QEMU with this and it failed, because QEMU expected the PMU
> to be enabled and tried to write to PMU MSRs.  I haven't dug through the QEMU
> code, but I assume that QEMU rightly expects that passing in PMU in CPUID when
> KVM_GET_SUPPORTED_CPUID says its supported will result in the VM having a PMU.

As long as the module parameter "enable_mediated_pmu" is enabled, qemu
needs below extra code to enable mediated vPMU, otherwise PMU is disabled
in KVM.

https://lore.kernel.org/all/20250324123712.34096-1-dapeng1.mi@linux.intel.com/

> I.e. by trying to get cute with backwards compatibility, I think we broke backwards
> compatiblity.  At this point, I'm leaning toward making the module param off-by-default,
> but otherwise not messing with the behavior of kvm->arch.enable_pmu.  Not sure if
> that has implications for KVM_PMU_CAP_DISABLE though.

I'm not sure if it's a kind of break for backwards compatibility.  As long
as "enable_mediated_pmu" is not enabled, the qemu doesn't need any changes,
the legacy vPMU can still be enabled by old qemu version. But if user want
to enable mediated vPMU, so they should use the new version qemu which has
the capability to enable mediated vPMU, it sounds reasonable for me.



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ