Message-ID: <92a463cc-608c-4efd-ba78-6da74b99b08b@arm.com>
Date: Fri, 7 Feb 2025 19:18:08 +0000
From: Mark Barnett <mark.barnett@....com>
To: Rob Herring <robh@...nel.org>
Cc: peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
namhyung@...nel.org, irogers@...gle.com, ben.gainey@....com,
deepak.surti@....com, ak@...ux.intel.com, will@...nel.org,
james.clark@....com, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
adrian.hunter@...el.com, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v2 1/5] perf: Allow periodic events to alternate between
two sample periods
On 1/31/25 18:44, Rob Herring wrote:
> On Mon, Jan 6, 2025 at 6:12 AM <mark.barnett@....com> wrote:
>>
>> From: Ben Gainey <ben.gainey@....com>
>>
>> This change modifies perf_event_attr to add a second, alternative
>> sample period field, and modifies the core perf overflow handling
>> such that, when this field is specified, an event will alternate
>> between the two sample periods.
>>
>> Currently, perf does not provide a mechanism for decoupling the period
>> over which counters are counted from the period between samples. This is
>> problematic for building a tool to measure per-function metrics derived
>> from a sampled counter group. Ideally such a tool wants a very small
>> sample window in order to correctly attribute the metrics to a given
>> function, but prefers a larger sample period that provides representative
>> coverage without excessive probe effect, triggering throttling, or
>> generating excessive amounts of data.
>>
>> By alternating between a long and short sample_period and subsequently
>> discarding the long samples, tools may decouple the period between
>> samples that the tool cares about from the window of time over which
>> interesting counts are collected.
>>
>> It is expected that typically tools would use this feature with the
>> cycles or instructions events as an approximation for time, but no
>> restrictions are applied to which events this can be applied to.
>>
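>> As a rough usage sketch (illustrative only; the event, period values
>> and sampling flags below are arbitrary), a tool might request the two
>> periods like so:
>>
>>     #include <linux/perf_event.h>
>>     #include <sys/syscall.h>
>>     #include <unistd.h>
>>
>>     /* Long period between samples, short window for the counts. */
>>     struct perf_event_attr attr = {
>>         .type              = PERF_TYPE_HARDWARE,
>>         .size              = sizeof(struct perf_event_attr),
>>         .config            = PERF_COUNT_HW_CPU_CYCLES,
>>         .sample_period     = 1000000,   /* long period between samples */
>>         .alt_sample_period = 1000,      /* short counting window */
>>         .sample_type       = PERF_SAMPLE_IP | PERF_SAMPLE_READ,
>>         .read_format       = PERF_FORMAT_GROUP,
>>     };
>>
>>     int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
>>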
>> Signed-off-by: Ben Gainey <ben.gainey@....com>
>> Signed-off-by: Mark Barnett <mark.barnett@....com>
>> ---
>>  include/linux/perf_event.h      |  5 +++++
>>  include/uapi/linux/perf_event.h |  3 +++
>>  kernel/events/core.c            | 37 ++++++++++++++++++++++++++++++++-
>>  3 files changed, 44 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index cb99ec8c9e96..cbb332f4e19c 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -276,6 +276,11 @@ struct hw_perf_event {
>>           */
>>          u64                             freq_time_stamp;
>>          u64                             freq_count_stamp;
>> +
>> +        /*
>> +         * Indicates that the alternative sample period is used
>> +         */
>> +        bool                            using_alt_sample_period;
>
> 8 bytes more for a single bit of data. I think we can avoid it. More below.
>
>> #endif
>> };
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 0524d541d4e3..499a8673df8e 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -379,6 +379,7 @@ enum perf_event_read_format {
>> #define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
>> #define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
>> #define PERF_ATTR_SIZE_VER8 136 /* add: config3 */
>> +#define PERF_ATTR_SIZE_VER9 144 /* add: alt_sample_period */
>>
>> /*
>> * Hardware event_id to monitor via a performance monitoring event:
>> @@ -531,6 +532,8 @@ struct perf_event_attr {
>>          __u64   sig_data;
>>
>>          __u64   config3; /* extension of config2 */
>> +
>> +        __u64   alt_sample_period;
>>  };
>>
>> /*
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 065f9188b44a..7e339d12363a 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -4178,6 +4178,8 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count, bo
>>          s64 period, sample_period;
>>          s64 delta;
>>
>> +        WARN_ON_ONCE(hwc->using_alt_sample_period);
>> +
>>          period = perf_calculate_period(event, nsec, count);
>>
>>          delta = (s64)(period - hwc->sample_period);
>> @@ -9850,6 +9852,7 @@ static int __perf_event_overflow(struct perf_event *event,
>>                                  int throttle, struct perf_sample_data *data,
>>                                  struct pt_regs *regs)
>>  {
>> +        struct hw_perf_event *hwc = &event->hw;
>>          int events = atomic_read(&event->event_limit);
>>          int ret = 0;
>>
>> @@ -9869,6 +9872,18 @@ static int __perf_event_overflow(struct perf_event *event,
>>              !bpf_overflow_handler(event, data, regs))
>>                  goto out;
>>
>> +        /*
>> +         * Swap the sample period to the alternative period
>> +         */
>> +        if (event->attr.alt_sample_period) {
>> +                bool using_alt = hwc->using_alt_sample_period;
>> +                u64 sample_period = (using_alt ? event->attr.sample_period
>> +                                               : event->attr.alt_sample_period);
>> +
>> +                hwc->sample_period = sample_period;
>> +                hwc->using_alt_sample_period = !using_alt;
>> +        }
>
> Wouldn't something like this avoid the need for using_alt_sample_period:
>
> if (event->attr.alt_sample_period) {
>         if (hwc->sample_period == event->attr.sample_period)
>                 hwc->sample_period = event->attr.alt_sample_period;
>         else
>                 hwc->sample_period = event->attr.sample_period;
> }
>
> Rob
Hi Rob,
Thanks for looking it over. That would work for this patch, but the
second patch in this series adds a variable jitter to the sample
periods, so 'hwc->sample_period' wouldn't be directly comparable to
'attr.sample_period'.
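For example, with the jitter applied the overflow path ends up doing
something along these lines (just a sketch, not the actual patch 2
code; get_jitter() stands in for whatever perturbation is applied):

    if (event->attr.alt_sample_period) {
            bool using_alt = hwc->using_alt_sample_period;
            u64 base = using_alt ? event->attr.sample_period
                                 : event->attr.alt_sample_period;

            /* the programmed period is the base plus a random jitter */
            hwc->sample_period = base + get_jitter(base);
            hwc->using_alt_sample_period = !using_alt;
    }

Since the programmed period then matches neither attribute value
exactly, the flag is the only reliable way to tell which period is
currently active.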
Mark