lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e34d49b8-4aa2-455a-a623-53a630d484ef@gmail.com>
Date: Thu, 7 Mar 2024 19:07:46 +0800
From: Like Xu <like.xu.linux@...il.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 Mingwei Zhang <mizhang@...gle.com>, Zhenyu Wang <zhenyuw@...ux.intel.com>,
 Zhang Xiong <xiong.y.zhang@...el.com>, Lv Zhiyuan <zhiyuan.lv@...el.com>,
 Dapeng Mi <dapeng1.mi@...el.com>, Jim Mattson <jmattson@...gle.com>,
 Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH] KVM: x86/pmu: Disable support for adaptive PEBS

On 7/3/2024 8:58 am, Sean Christopherson wrote:
> Drop support for virtualizing adaptive PEBS, as KVM's implementation is
> architecturally broken without an obvious/easy path forward, and because
> exposing adaptive PEBS can leak host LBRs to the guest, i.e. can leak
> host kernel addresses to the guest.
> 
> Bug #1 is that KVM doesn't doesn't account for the upper 32 bits of
> IA32_FIXED_CTR_CTRL when (re)programming fixed counters, e.g
> fixed_ctrl_field() drops the upper bits, reprogram_fixed_counters()
> stores local variables as u8s and truncates the upper bits too, etc.
> 
> Bug #2 is that, because KVM _always_ sets precise_ip to a non-zero value
> for PEBS events, perf will _always_ generate an adaptive record, even if
> the guest requested a basic record.  Note, KVM will also enable adaptive
> PEBS in individual *counter*, even if adaptive PEBS isn't exposed to the
> guest, but this is benign as MSR_PEBS_DATA_CFG is guaranteed to be zero,
> i.e. the guest will only ever see Basic records.
> 
> Bug #3 is in perf.  intel_pmu_disable_fixed() doesn't clear the upper
> bits either, i.e. leaves ICL_FIXED_0_ADAPTIVE set, and
> intel_pmu_enable_fixed() effectively doesn't clear ICL_FIXED_0_ADAPTIVE
> either.  I.e. perf _always_ enables ADAPTIVE counters, regardless of what
> KVM requests.

The three issues above all point to a fix in one direction: to pass the value
of vcpu's EVT_SELx.Adaptive_Record[34] or FCx_Adaptive_Record to the
perf/core in a way and let the PEBS assist take effect as expected by vPEBS.

One place to address this is in the intel_guest_get_msrs() again:
- update vPMC[x].pPMC.fcctl_or_evtsel.ADAPTIVE = vPMC[x].use_adaptive
, since guest PEBS is disabled if host PEBS is enabled.

> 
> Bug #4 is that adaptive PEBS *might* effectively bypass event filters set
> by the host, as "Updated Memory Access Info Group" records information
> that might be disallowed by userspace via KVM_SET_PMU_EVENT_FILTER.

This could be seen as a missing feature, that is, whether PMU_EVENT_FILTER
can control PEBS events even if they share the same event encoding.

Furthermore, if LBR_FMT is cleared only by VMM, could the guest use
adaptive pebs to obtain valid guest_lbr records. It's open in the virt context
since real hardware doesn't have this issue.

> 
> Bug #5 is that KVM doesn't ensure LBR MSRs hold guest values (or at least
> zeros) when entering a vCPU with adaptive PEBS, which allows the guest
> to read host LBRs, i.e. host RIPs/addresses, by enabling "LBR Entries"
> records.
> 
> Disable adaptive PEBS support as an immediate fix due to the severity of
> the LBR leak in particular, and because fixing all of the bugs will be
> non-trivial, e.g. not suitable for backporting to stable kernels.
> 
> Note!  This will break live migration, but trying to make KVM play nice
> with live migration would be quite complicated, wouldn't be guaranteed to
> work (i.e. KVM might still kill/confuse the guest), and it's not clear
> that there are any publicly available VMMs that support adaptive PEBS,
> let alone live migrate VMs that support adaptive PEBS, e.g. QEMU doesn't
> support PEBS in any capacity.
> 
> Link: https://lore.kernel.org/all/20240306230153.786365-1-seanjc@google.com
> Link: https://lore.kernel.org/all/ZeepGjHCeSfadANM@google.com
> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
> Cc: stable@...r.kernel.org
> Cc: Like Xu <like.xu.linux@...il.com>
> Cc: Mingwei Zhang <mizhang@...gle.com>
> Cc: Zhenyu Wang <zhenyuw@...ux.intel.com>
> Cc: Zhang Xiong <xiong.y.zhang@...el.com>
> Cc: Lv Zhiyuan <zhiyuan.lv@...el.com>
> Cc: Dapeng Mi <dapeng1.mi@...el.com>
> Cc: Jim Mattson <jmattson@...gle.com>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>

Acked-by: Like Xu <likexu@...cent.com>

> ---
>   arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++++++++++++--
>   1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7a74388f9ecf..641a7d5bf584 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7864,8 +7864,28 @@ static u64 vmx_get_perf_capabilities(void)
>   
>   	if (vmx_pebs_supported()) {
>   		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
> -		if ((perf_cap & PERF_CAP_PEBS_FORMAT) < 4)
> -			perf_cap &= ~PERF_CAP_PEBS_BASELINE;
> +
> +		/*
> +		 * Disallow adaptive PEBS as it is functionally broken, can be
> +		 * used by the guest to read *host* LBRs, and can be used to
> +		 * bypass userspace event filters.  To correctly and safely
> +		 * support adaptive PEBS, KVM needs to:
> +		 *
> +		 * 1. Account for the ADAPTIVE flag when (re)programming fixed
> +		 *    counters.
> +		 *
> +		 * 2. Gain support from perf (or take direct control of counter
> +		 *    programming) to support events without adaptive PEBS
> +		 *    enabled for the hardware counter.
> +		 *
> +		 * 3. Ensure LBR MSRs cannot hold host data on VM-Entry with
> +		 *    adaptive PEBS enabled and MSR_PEBS_DATA_CFG.LBRS=1.
> +		 *
> +		 * 4. Document which PMU events are effectively exposed to the
> +		 *    guest via adaptive PEBS, and make adaptive PEBS mutually
> +		 *    exclusive with KVM_SET_PMU_EVENT_FILTER if necessary.
> +		 */
> +		perf_cap &= ~PERF_CAP_PEBS_BASELINE;
>   	}
>   
>   	return perf_cap;
> 
> base-commit: 0c64952fec3ea01cb5b09f00134200f3e7ab40d5

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ