Date: Fri, 26 Apr 2024 20:04:42 -0700
From: Mingwei Zhang <mizhang@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Kan Liang <kan.liang@...ux.intel.com>, Dapeng Mi <dapeng1.mi@...ux.intel.com>, 
	maobibo <maobibo@...ngson.cn>, Xiong Zhang <xiong.y.zhang@...ux.intel.com>, pbonzini@...hat.com, 
	peterz@...radead.org, kan.liang@...el.com, zhenyuw@...ux.intel.com, 
	jmattson@...gle.com, kvm@...r.kernel.org, linux-perf-users@...r.kernel.org, 
	linux-kernel@...r.kernel.org, zhiyuan.lv@...el.com, eranian@...gle.com, 
	irogers@...gle.com, samantha.alt@...el.com, like.xu.linux@...il.com, 
	chao.gao@...el.com
Subject: Re: [RFC PATCH 23/41] KVM: x86/pmu: Implement the save/restore of PMU
 state for Intel CPU

On Fri, Apr 26, 2024 at 12:46 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Fri, Apr 26, 2024, Kan Liang wrote:
> > > Optimization 4
> > > allows the host side to immediately profile this part instead of
> > > waiting for the vCPU to reach the PMU context switch locations. Doing so
> > > will generate more accurate results.
> >
> > If so, I think 4 is a must-have. Otherwise, it wouldn't honor the
> > definition of exclude_guest. Without 4, it brings some random blind
> > spots, right?
>
> +1, I view it as a hard requirement.  It's not an optimization, it's about
> accuracy and functional correctness.

Well, does it have to be a _hard_ requirement? The IRQ handler
triggered by "perf record -a" could just inject a "state" (a pending
request). Instead of immediately preempting the guest PMU context, the
perf subsystem could let KVM defer the context switch until it reaches
the next PMU context switch location.

This is the same as the kernel preemption logic. Do you want me to
stop the work immediately? Either yes (if you enable preemption), or
no, let me finish my job and get to the scheduling point.

Implementing immediate preemption might make things harder to debug.
That's my real concern. If we do not enable preemption, the PMU context
switch will only happen at the 2 pairs of locations. If we enable
preemption, it could happen at any time.
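
To make the two modes concrete, here is a toy user-space model in C. It
is only a sketch, not kernel code: every name in it (toy_vcpu,
toy_host_request_pmu, toy_pmu_switch_point, allow_preempt) is made up
for illustration and none of it is an existing KVM or perf interface.
It assumes the host request either switches the PMU right away or just
sets a pending flag that is consumed at the next fixed switch location.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; these names do not exist in KVM or perf. */
enum pmu_owner { PMU_OWNER_HOST, PMU_OWNER_GUEST };

struct toy_vcpu {
	enum pmu_owner owner;	/* whose PMU state is loaded right now */
	bool switch_pending;	/* host asked for the PMU, not yet honored */
	bool allow_preempt;	/* true: switch immediately, false: defer */
	int switches;		/* guest->host switches performed so far */
};

/* Save guest PMU state and hand the hardware back to the host. */
static void toy_put_guest_pmu(struct toy_vcpu *v)
{
	assert(v->owner == PMU_OWNER_GUEST);
	v->owner = PMU_OWNER_HOST;
	v->switch_pending = false;
	v->switches++;
}

/*
 * Host side, e.g. the path taken when "perf record -a" wants to count
 * while guest PMU state is still loaded. Returns true if the switch
 * happened immediately, false if it was deferred or not needed.
 */
static bool toy_host_request_pmu(struct toy_vcpu *v)
{
	if (v->owner == PMU_OWNER_HOST)
		return false;

	if (v->allow_preempt) {
		/* Preemptive mode: can happen at any time. */
		toy_put_guest_pmu(v);
		return true;
	}

	/* Deferred mode: only record the request. */
	v->switch_pending = true;
	return false;
}

/* Called only at the fixed PMU context switch locations. */
static void toy_pmu_switch_point(struct toy_vcpu *v)
{
	if (v->switch_pending && v->owner == PMU_OWNER_GUEST)
		toy_put_guest_pmu(v);
}
```

With allow_preempt cleared, the owner only ever changes inside
toy_pmu_switch_point(), which is what makes that mode easier to reason
about. The cost is the window between the request and the next switch
point, during which the host is not counting.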

>
> What _is_ an optimization is keeping guest state loaded while KVM is in its
> run loop, i.e. initial mediated/passthrough PMU support could land upstream with
> unconditional switches at entry/exit.  The performance of KVM would likely be
> unacceptable for any production use cases, but that would give us motivation to
> finish the job, and it doesn't result in random, hard to diagnose issues for
> userspace.

That's true. I agree with that.

>
> > > Do we want to preempt that? I think it depends. For regular cloud
> > > usage, we don't. But for any other usages where we want to prioritize
> > > KVM/VMM profiling over guest vPMU, it is useful.
> > >
> > > My current opinion is that optimization 4 is something nice to have.
> > > But we should allow people to turn it off just like we could choose to
> > > disable preempt kernel.
> >
> > The exclude_guest means everything but the guest. I don't see a reason
> > why people want to turn it off and get some random blind spots.
