lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 11 Nov 2020 10:42:45 +0800
From:   "Xu, Like" <like.xu@...el.com>
To:     Stephane Eranian <eranian@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Like Xu <like.xu@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Kan Liang <kan.liang@...ux.intel.com>, luwei.kang@...el.com,
        Thomas Gleixner <tglx@...utronix.de>,
        "Wang, Wei W" <wei.w.wang@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Mark Gross <mgross@...ux.intel.com>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] perf/intel: Remove Perfmon-v4 counter_freezing support

Hi Peter,

On 2020/11/11 4:52, Stephane Eranian wrote:
> On Tue, Nov 10, 2020 at 7:37 AM Peter Zijlstra<peterz@...radead.org>  wrote:
>> On Tue, Nov 10, 2020 at 04:12:57PM +0100, Peter Zijlstra wrote:
>>> On Mon, Nov 09, 2020 at 10:12:37AM +0800, Like Xu wrote:
>>>> The Precise Event Based Sampling(PEBS) supported on Intel Ice Lake server
>>>> platforms can provide an architectural state of the instruction executed
>>>> after the instruction that caused the event. This patch set enables the
>>>> the PEBS via DS feature for KVM (also non) Linux guest on the Ice Lake.
>>>> The Linux guest can use PEBS feature like native:
>>>>
>>>>    # perf record -e instructions:ppp ./br_instr a
>>>>    # perf record -c 100000 -e instructions:pp ./br_instr a
>>>>
>>>> If the counter_freezing is not enabled on the host, the guest PEBS will
>>>> be disabled on purpose when host is using PEBS facility. By default,
>>>> KVM disables the co-existence of guest PEBS and host PEBS.
Thanks Stephane for clarifying the use cases for Freeze-on-[PMI|Overflow].

Please let me express it more clearly.

The goal of the whole patch set is to enable guest PEBS, regardless of
whether the counter_freezing is frozen or not. By default, it will not
support both the guest and the host to use PEBS at the same time.

Please continue reviewing the patch set, especially for the slow path
we proposed this time and related host perf changes:

- add intel_pmu_handle_guest_pebs() to __intel_pmu_pebs_event();
- add switch MSRs (PEBS_ENABLE, DS_AREA, DATA_CFG) to intel_guest_get_msrs();
- the construction of incoming parameters for 
perf_event_create_kernel_counter();

I believe if you understand the general idea, the comments will be very 
valuable.

Thanks,
Like Xu

>>> Uuhh, what?!? counter_freezing should never be enabled, its broken. Let
>>> me go delete all that code.
>> ---
>> Subject: perf/intel: Remove Perfmon-v4 counter_freezing support
>>
>> Perfmon-v4 counter freezing is fundamentally broken; remove this default
>> disabled code to make sure nobody uses it.
>>
>> The feature is called Freeze-on-PMI in the SDM, and if it would do that,
>> there wouldn't actually be a problem,*however*  it does something subtly
>> different. It globally disables the whole PMU when it raises the PMI,
>> not when the PMI hits.
>>
>> This means there's a window between the PMI getting raised and the PMI
>> actually getting served where we loose events and this violates the
>> perf counter independence. That is, a counting event should not result
>> in a different event count when there is a sampling event co-scheduled.
>>
> What is implemented is Freeze-on-Overflow, yet it is described as Freeze-on-PMI.
> That, in itself, is a problem. I agree with you on that point.
>
> However, there are use cases for both modes.
>
> I can sample on event A and count on B, C and when A overflows, I want
> to snapshot B, C.
> For that I want B, C at the moment of the overflow, not at the moment
> the PMI is delivered. Thus, youd
> would want the Freeze-on-overflow behavior. You can collect in this
> mode with the perf tool,
> IIRC: perf record -e '{cycles,instructions,branches:S}' ....
>
> The other usage model is that of the replay-debugger (rr) which you are alluding
> to, which needs precise count of an event including during the skid
> window. For that, you need
> Freeze-on-PMI (delivered). Note that this tool likely only cares about
> user level occurrences of events.
>
> As for counter independence, I am not sure it holds in all cases. If
> the events are setup for user+kernel
> then, as soon as you co-schedule a sampling event, you will likely get
> more counts on the counting
> event due to the additional kernel entries/exits caused by
> interrupt-based profiling. Even if you were to
> restrict to user level only, I would expect to see a few more counts.
>
>
>> This is known to break existing software.
>>
>> Signed-off-by: Peter Zijlstra (Intel)<peterz@...radead.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ