Message-ID: <4bd55385-0c8e-4989-95be-37862b564dea@linux.intel.com>
Date: Mon, 29 Apr 2024 09:08:42 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Mingwei Zhang <mizhang@...gle.com>,
 Sean Christopherson <seanjc@...gle.com>
Cc: Dapeng Mi <dapeng1.mi@...ux.intel.com>, maobibo <maobibo@...ngson.cn>,
 Xiong Zhang <xiong.y.zhang@...ux.intel.com>, pbonzini@...hat.com,
 peterz@...radead.org, kan.liang@...el.com, zhenyuw@...ux.intel.com,
 jmattson@...gle.com, kvm@...r.kernel.org, linux-perf-users@...r.kernel.org,
 linux-kernel@...r.kernel.org, zhiyuan.lv@...el.com, eranian@...gle.com,
 irogers@...gle.com, samantha.alt@...el.com, like.xu.linux@...il.com,
 chao.gao@...el.com
Subject: Re: [RFC PATCH 23/41] KVM: x86/pmu: Implement the save/restore of PMU
 state for Intel CPU



On 2024-04-26 11:04 p.m., Mingwei Zhang wrote:
> On Fri, Apr 26, 2024 at 12:46 PM Sean Christopherson <seanjc@...gle.com> wrote:
>>
>> On Fri, Apr 26, 2024, Kan Liang wrote:
>>>> Optimization 4
>>>> allows the host side to immediately profile this part instead of
>>>> waiting for the vcpu to reach the PMU context switch locations. Doing so
>>>> will generate more accurate results.
>>>
>>> If so, I think 4 is a must-have. Otherwise, it wouldn't honor the
>>> definition of exclude_guest. Without 4, it brings some random blind
>>> spots, right?
>>
>> +1, I view it as a hard requirement.  It's not an optimization, it's about
>> accuracy and functional correctness.
> 
> Well. Does it have to be a _hard_ requirement? no? The irq handler
> triggered by "perf record -a" could just inject a "state". Instead of
> immediately preempting the guest PMU context, the perf subsystem could
> allow KVM to defer the context switch until it reaches the next PMU
> context switch location.

It depends on what the upcoming PMU context switch location is.
If it's the upcoming VM-exit/entry, deferring should be fine, because
it's an exclude_guest event and nothing should be counted while a VM is
running. If it's the upcoming vCPU boundary, no. I think there may be
several VM-exits/entries before the upcoming vCPU switch, so we may lose
some results.
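
To illustrate the difference between the two deferral points, here is a
toy model (a sketch only, not kernel code). The segment kinds, policy
names, and durations are all hypothetical labels made up for this
example. It assumes the host exclude_guest event is requested while the
guest PMU is loaded, and that once the event is scheduled it stays
scheduled.

```python
# Toy model only: the segment kinds and policy names are hypothetical
# labels for this sketch, not KVM or perf identifiers.

GUEST = "guest"               # guest code running (after VM-entry)
HOST_RUN_LOOP = "host_loop"   # host code between VM-exit and next VM-entry
HOST_SCHED_OUT = "host_put"   # host code after the vCPU is scheduled out


def host_time_counted(timeline, policy):
    """Host time units an exclude_guest event counts under a policy.

    timeline: ordered list of (kind, duration) segments. The host event
              is requested at the very start, while the guest PMU is
              loaded.
    policy:   "immediate" - switch the PMU context right away
              "vm_exit"   - defer the switch to the next VM-exit
              "vcpu_put"  - defer the switch until the vCPU is
                            scheduled out
    """
    if policy not in ("immediate", "vm_exit", "vcpu_put"):
        raise ValueError(f"unknown policy: {policy}")

    scheduled = policy == "immediate"
    counted = 0
    for kind, duration in timeline:
        if kind == GUEST:
            continue  # exclude_guest never counts guest time
        # Every host segment begins with a VM-exit, so the "vm_exit"
        # policy schedules the event here. A scheduled-out vCPU
        # schedules it under every policy.
        if policy == "vm_exit" or kind == HOST_SCHED_OUT:
            scheduled = True
        if scheduled:
            counted += duration
    return counted


# Three VM-entries; the vCPU is scheduled out only after the third exit.
TIMELINE = [
    (GUEST, 5), (HOST_RUN_LOOP, 2),
    (GUEST, 5), (HOST_RUN_LOOP, 3),
    (GUEST, 4), (HOST_SCHED_OUT, 6),
]

print(host_time_counted(TIMELINE, "immediate"))  # 11
print(host_time_counted(TIMELINE, "vm_exit"))    # 11
print(host_time_counted(TIMELINE, "vcpu_put"))   # 6
```

In this model, deferring to the next VM-exit loses nothing compared with
switching immediately, while deferring to the vCPU boundary misses the
host time spent in the run loop between VM-exits.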
> 
> This is the same as the preemption kernel logic. Do you want me to
> stop the work immediately? Yes (if you enable preemption), or No, let
> me finish my job and get to the scheduling point.

I don't think it's necessary. Just making sure that the counters are
scheduled at the upcoming VM-exit/entry boundary should be fine.

Thanks,
Kan
> 
> Implementing this might be more difficult to debug. That's my real
> concern. If we do not enable preemption, the PMU context switch will
> only happen at the 2 pairs of locations. If we enable preemption, it
> could happen at any time.
> 
>>
>> What _is_ an optimization is keeping guest state loaded while KVM is in its
>> run loop, i.e. initial mediated/passthrough PMU support could land upstream with
>> unconditional switches at entry/exit.  The performance of KVM would likely be
>> unacceptable for any production use cases, but that would give us motivation to
>> finish the job, and it doesn't result in random, hard to diagnose issues for
>> userspace.
> 
> That's true. I agree with that.
> 
>>
>>>> Do we want to preempt that? I think it depends. For regular cloud
>>>> usage, we don't. But for any other usages where we want to prioritize
>>>> KVM/VMM profiling over guest vPMU, it is useful.
>>>>
>>>> My current opinion is that optimization 4 is something nice to have.
>>>> But we should allow people to turn it off just like we could choose to
>>>> disable preempt kernel.
>>>
>>> The exclude_guest means everything but the guest. I don't see a reason
>>> why people want to turn it off and get some random blind spots.
> 
