Message-ID: <20240801140340.GF37996@noisy.programming.kicks-ass.net>
Date: Thu, 1 Aug 2024 16:03:40 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: kan.liang@...ux.intel.com
Cc: mingo@...nel.org, acme@...nel.org, namhyung@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com,
alexander.shishkin@...ux.intel.com, linux-kernel@...r.kernel.org,
ak@...ux.intel.com, eranian@...gle.com,
Sandipan Das <sandipan.das@....com>,
Ravi Bangoria <ravi.bangoria@....com>,
silviazhao <silviazhao-oc@...oxin.com>
Subject: Re: [PATCH V4 1/5] perf/x86: Extend event update interface

On Wed, Jul 31, 2024 at 07:38:31AM -0700, kan.liang@...ux.intel.com wrote:
> From: Kan Liang <kan.liang@...ux.intel.com>
>
> The current event update interface reads the values directly from the
> counter, but those values may not be the accurate ones users require. For
> example, the sample read feature wants the counter values of the member
> events at the moment the leader event overflows. But with the current
> implementation, the read (event update) actually happens in the NMI
> handler. There may be a small gap between the overflow and the NMI
> handler.

This...

> The new Intel PEBS counters snapshotting feature can provide
> the accurate counter value at the overflow. The event update interface
> has to be extended to apply the given accurate values.
>
> Pass the accurate values via the event update interface. If the value is
> not available, still read the counter directly.
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 12f2a0c14d33..07a56bf71160 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -112,7 +112,7 @@ u64 __read_mostly hw_cache_extra_regs
>   * Can only be executed on the CPU where the event is active.
>   * Returns the delta events processed.
>   */
> -u64 x86_perf_event_update(struct perf_event *event)
> +u64 x86_perf_event_update(struct perf_event *event, u64 *val)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
>  	int shift = 64 - x86_pmu.cntval_bits;
> @@ -131,7 +131,10 @@ u64 x86_perf_event_update(struct perf_event *event)
>  	 */
>  	prev_raw_count = local64_read(&hwc->prev_count);
>  	do {
> -		rdpmcl(hwc->event_base_rdpmc, new_raw_count);
> +		if (!val)
> +			rdpmcl(hwc->event_base_rdpmc, new_raw_count);
> +		else
> +			new_raw_count = *val;
>  	} while (!local64_try_cmpxchg(&hwc->prev_count,
>  					&prev_raw_count, new_raw_count));
>

Does that mean the following is possible?

Two counters: C0 and C1, where C0 is a PEBS counter that also samples
C1.

  C0: overflow-with-PEBS-assist -> PEBS entry with counter value A
      (DS buffer threshold not reached)

  C1: overflow -> PMI -> x86_perf_event_update(C1, NULL)
      rdpmcl reads value 'A+d', and sets prev_raw_count

  C0: more assists, hit DS threshold -> PMI
      PEBS processing does x86_perf_event_update(C1, A)
      and sets prev_raw_count *backwards*

How is that sane?