lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aBy2AGIFi34q031x@google.com>
Date: Thu, 8 May 2025 06:47:44 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Dapeng Mi <dapeng1.mi@...ux.intel.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Seth Forshee <sforshee@...nel.org>
Subject: Re: [PATCH] perf/x86/intel: KVM: Mask PEBS_ENABLE loaded for guest
 with vCPU's value.

On Tue, May 06, 2025, Dapeng Mi wrote:
> 
> On 5/2/2025 11:04 PM, Sean Christopherson wrote:
> > On Sun, Apr 27, 2025, Dapeng Mi wrote:
> >> On 4/26/2025 8:13 AM, Sean Christopherson wrote:
> >> Currently we have this Sean's fix, only the guest PEBS event bits of
> >> IA32_PEBS_ENABLE MSR are enabled in non-root mode, suppose we can simply
> >> change global_ctrl guest value calculation to this.
> >>
> >> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> >> index 09d2d66c9f21..5bc56bb616ec 100644
> >> --- a/arch/x86/events/intel/core.c
> >> +++ b/arch/x86/events/intel/core.c
> >> @@ -4342,9 +4342,12 @@ static struct perf_guest_switch_msr
> >> *intel_guest_get_msrs(int *nr, void *data)
> >>         arr[global_ctrl] = (struct perf_guest_switch_msr){
> >>                 .msr = MSR_CORE_PERF_GLOBAL_CTRL,
> >>                 .host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
> >> -               .guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
> >> +               .guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask,
> >>         };
> > Hmm, that's not as clear cut.  PEBS needs to be disabled because leaving it enabled
> > will crash the guest.  For the counter itself, unless leaving it enabled breaks
> > perf and/or degrades the sampling, I don't think there's an obvious right/wrong
> > approach.
> >
> > E.g. if the host wants to profile host and guest, then keeping the count running
> > while the guest is active might be a good thing.  It's still far, far from
> > perfect, as a counter that overflows when the guest is active won't generate a
> > PEBS record, but without digging further, it's not obvious that even that flaw
> > is overall worse than always disabling the counter.
> 
> Hmm, if the host PEBS event only samples host side, whether the HW counter
> or the PEBS engine would be disabled once VM enters non-root mode, the KVM
> PEBS implementation is correct. But for the host PEBS events which sampling
> both host and guest, the implementation seems incorrect.

Well, yeah, because there is no correct implementation.

> As the below code shows, as long as there are host PEBS events regardless
> of the host PEBS events only sample guest or both host and guest, the host
> PEBS events would be disabled on both HW counters and PEBS engine once VM
> enters non-root mode.
> 
>     arr[global_ctrl] = (struct perf_guest_switch_msr){
>         .msr = MSR_CORE_PERF_GLOBAL_CTRL,
>         .host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
>         .guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
>     };
> 
>     if (arr[pebs_enable].host) {
>         /* Disable guest PEBS if host PEBS is enabled. */
>         arr[pebs_enable].guest = 0;
> 
>     }
> 
> So the host PEBS events which hopes to sample both host and guest only
> samples host side in fact. This is unexpected.

It's only unexpected in the sense that it's probably not well documented.  Because
the DS buffer is virtually address, there simply isn't a sane way to enable PEBS
(or any feature that utizies the DS buffer) while running a KVM guest that isn't
enlightened to explicitly allow profiling via host PEBS (and AFAIK, no such guest
exists).

Even when KVM is using shadowing paging, i.e. fully controls the page tables used
while the guest is running, enabling PEBS isn't feasible as KVM has no way to
prevent the guest from using the virtual address.  E.g. KVM could shove in mappings
for the DS buffer, but that DoS the guest if the guest wants to use the same
virtual address range for its own purposes, and would be a massive data leak to
the guest since the guest could read host data from the buffer.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ