linux-kernel - Re: [PATCH v1 1/4] perf: Allow periodic events to alternate between two sample periods


Message-ID: <20241114150152.GC39245@noisy.programming.kicks-ass.net>
Date: Thu, 14 Nov 2024 16:01:52 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Deepak Surti <deepak.surti@....com>
Cc: mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
	mark.barnett@....com, ben.gainey@....com, ak@...ux.intel.com,
	will@...nel.org, james.clark@....com, mark.rutland@....com,
	alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
	irogers@...gle.com, adrian.hunter@...el.com,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v1 1/4] perf: Allow periodic events to alternate between
 two sample periods

On Thu, Nov 07, 2024 at 04:07:18PM +0000, Deepak Surti wrote:
> From: Ben Gainey <ben.gainey@....com>
> 
> This change modifies perf_event_attr to add a second, alternative
> sample period field, and modifies the core perf overflow handling
> such that when specified an event will alternate between two sample
> periods.
> 
> Currently, perf does not provide a mechanism for decoupling the period
> over which counters are counted from the period between samples. This is
> problematic for building a tool to measure per-function metrics derived
> from a sampled counter group. Ideally such a tool wants a very small
> sample window in order to correctly attribute the metrics to a given
> function, but prefers a larger sample period that provides representative
> coverage without excessive probe effect, triggering throttling, or
> generating excessive amounts of data.
> 
> By alternating between a long and short sample_period and subsequently
> discarding the long samples, tools may decouple the period between
> samples that the tool cares about from the window of time over which
> interesting counts are collected.

Do you have a link to a paper or something that explains this method?


> +	/*
> +	 * Indicates that the alternative_sample_period is used
> +	 */
> +	bool				using_alternative_sample_period;

I typically prefer shorter variable names.


> @@ -9822,6 +9825,26 @@ static int __perf_event_overflow(struct perf_event *event,
>  	    !bpf_overflow_handler(event, data, regs))
>  		return ret;
>  
> +	/*
> +	 * Swap the sample period to the alternative period
> +	 */
> +	if (event->attr.alternative_sample_period) {
> +		bool using_alt = hwc->using_alternative_sample_period;
> +		u64 sample_period = (using_alt ? event->attr.sample_period
> +					       : event->attr.alternative_sample_period);
> +
> +		hwc->sample_period = sample_period;
> +		hwc->using_alternative_sample_period = !using_alt;
> +
> +		if (local64_read(&hwc->period_left) > 0) {
> +			event->pmu->stop(event, PERF_EF_UPDATE);
> +
> +			local64_set(&hwc->period_left, 0);
> +
> +			event->pmu->start(event, PERF_EF_RELOAD);
> +		}

This is quite terrible :-(

Getting here means we just went through the effort of programming the
period and you'll pretty much always hit that 'period_left > 0' case.

Why do we need this case at all? If you don't do this, then the next
overflow will pick the period you just wrote to hwc->sample_period
(although you might want to audit all arch implementations).

Looking at it again, that truncation to 0 is just plain wrong -- always.
Why are you doing this?
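
To make the simpler variant concrete, here is a minimal user-space sketch (not kernel code) of an overflow path that only swaps hwc->sample_period and leaves period_left alone. The sim_* names and both structs are hypothetical stand-ins for perf_event_attr and hw_perf_event. It also assumes the arch handler reloads the counter from hwc->sample_period before the core overflow path runs, which is the ordering described above; real PMU drivers would still need auditing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct sim_attr {
	uint64_t sample_period;
	uint64_t alternative_sample_period;	/* 0 means "feature unused" */
};

struct sim_hwc {
	uint64_t sample_period;	/* period the next reload will program */
	int64_t period_left;	/* events remaining until the next overflow */
	bool using_alt;		/* shorter name for the "alternative in use" flag */
};

/* Models the arch code reprogramming the counter on overflow. */
static void sim_set_period(struct sim_hwc *hwc)
{
	hwc->period_left = (int64_t)hwc->sample_period;
}

/* Core overflow path: swap the period only, no stop/start, no period_left = 0. */
static void sim_overflow(const struct sim_attr *attr, struct sim_hwc *hwc)
{
	if (!attr->alternative_sample_period)
		return;

	hwc->sample_period = hwc->using_alt ? attr->sample_period
					    : attr->alternative_sample_period;
	hwc->using_alt = !hwc->using_alt;
}

/*
 * Record the length of the first 'max' counting intervals.
 * Each loop iteration stands for one counter overflow.
 */
static size_t sim_run(const struct sim_attr *attr, uint64_t *intervals, size_t max)
{
	struct sim_hwc hwc = { .sample_period = attr->sample_period };
	size_t n = 0;

	assert(intervals != NULL);

	sim_set_period(&hwc);			/* initial programming */
	while (n < max) {
		intervals[n++] = (uint64_t)hwc.period_left;
		sim_set_period(&hwc);		/* arch handler reloads first ... */
		sim_overflow(attr, &hwc);	/* ... then the core swaps for next time */
	}
	return n;
}
```

With sample_period = 1000 and alternative_sample_period = 10, the first five intervals in this model are 1000, 1000, 10, 1000, 10: the swap takes effect one overflow late, and from then on the two periods alternate without ever stopping and restarting the counter.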