Date:	Fri, 15 Aug 2014 16:31:21 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	Andi Kleen <ak@...ux.intel.com>,
	Namhyung Kim <namhyung@...nel.org>,
	Jiri Olsa <jolsa@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Andi Kleen <andi@...stfloor.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH 4/5] perf, x86: Add INST_RETIRED.ALL workarounds

On Thu, Aug 14, 2014 at 07:47:56PM +0200, Stephane Eranian wrote:
> [+perf tool maintainers]
> 
> On Thu, Aug 14, 2014 at 4:30 PM, Andi Kleen <ak@...ux.intel.com> wrote:
> >
> > I understand all your points, but there's no alternative.
> > The only other way would be to disable INST_RETIRED.ALL.
> >
> You cannot do that either. INST_RETIRED:ALL is important.  I assume
> the bug applies whether or not the event is used with a filter.
> 
> I think we need to ensure that by looking at the perf.data file, one
> can reconstruct the total number of inst_Retired:all occurrences for
> the run. With a fixed period, one would do num_samples * fixed_period.
> I know the Gooda tool does that. It is used to estimate the number of
> events captured vs. the number of events occurring.

OK, I think we can make that work; IFF we guarantee
perf_event_attr::sample_period >= 128.

Suppose we start out with sample_period=192; then we'll set period_left
to 192 and end up with left = 128 (we truncate the lower bits). We get
an interrupt at n=128 and find that period_left = 64 (>0, so we return 0
and don't call the overflow handler); we up that to 128. Then we trigger
again, at n=256, and find period_left = -64 (<=0, so we return 1 and do
get an overflow). We increment by sample_period, so period_left = 128
and left = 128. We fire again, at n=384, with period_left = 0 (<=0, so we
return 1 and get an overflow). And on and on.
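
The walk-through above can be replayed with a small toy model. This is
only a sketch of the arithmetic described in this mail, not the kernel
code; the function name, the `granularity` parameter and the
round-up-to-128 step are assumptions made for illustration.

```python
def simulate_overflows(sample_period, total_events, granularity=128):
    """Return the event counts n at which an overflow is reported.

    Toy model (assumption): the counter can only be programmed in
    multiples of `granularity`, and never with less than one granule.
    """
    assert sample_period >= granularity
    period_left = sample_period
    n = 0
    overflows = []
    while True:
        # Truncate the lower bits, then bump a too-small value up to 128.
        left = max(period_left & ~(granularity - 1), granularity)
        if n + left > total_events:
            break
        n += left                  # counter interrupts after `left` events
        period_left -= left
        if period_left <= 0:       # "return 1": report an overflow
            overflows.append(n)
            period_left += sample_period
        # else "return 0": no overflow, just reprogram the counter
    return overflows


print(simulate_overflows(192, 768))   # [256, 384, 640, 768]
```

The reported overflows are spaced 256, 128, 256, 128 events apart, which
is the pattern traced by hand above.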

So while the individual interrupts are 'wrong', we get them with
interval=256,128 in exactly the right ratio to average out at 192. And
this works for everything >=128.

So the num_samples*fixed_period thing is still entirely correct +- 127,
which is good enough I'd say, as you already have that error anyhow.
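
As a rough check of that bound, the sketch below compares
num_samples * sample_period against the true event count at every
reported sample, for a range of periods. It uses a toy model of the
scheme described in this mail (truncate to a multiple of 128, never
program less than 128); the function name and the chosen period range
are assumptions, and this is not the kernel implementation.

```python
def max_estimate_error(sample_period, total_events, granularity=128):
    """Largest |n - num_samples * sample_period| seen at any sample."""
    period_left, n, samples, worst = sample_period, 0, 0, 0
    while True:
        # Program the counter in whole granules, at least one granule.
        left = max(period_left & ~(granularity - 1), granularity)
        if n + left > total_events:
            return worst
        n += left
        period_left -= left
        if period_left <= 0:
            samples += 1
            period_left += sample_period
            # Compare the tool's estimate with the true count so far.
            worst = max(worst, abs(n - samples * sample_period))


# Worst case over a range of periods >= 128 (range is arbitrary).
print(max(max_estimate_error(p, 100_000) for p in range(128, 1024)))  # 127
```

In this model the estimate never drifts: the error at each sample stays
within 0..127 events regardless of how long the run is.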

So no need to 'fix' the tools; all we need to do is refuse to create
INST_RETIRED:ALL events with sample_period < 128.
