linux-kernel - Re: [PATCH V6 4/6] perf, x86: handle multiple records in PEBS buffer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150417182037.GZ2366@two.firstfloor.org>
Date:	Fri, 17 Apr 2015 20:20:37 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"Liang, Kan" <kan.liang@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"mingo@...nel.org" <mingo@...nel.org>,
	"acme@...radead.org" <acme@...radead.org>,
	"eranian@...gle.com" <eranian@...gle.com>,
	"andi@...stfloor.org" <andi@...stfloor.org>
Subject: Re: [PATCH V6 4/6] perf, x86: handle multiple records in PEBS buffer

On Fri, Apr 17, 2015 at 04:44:07PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 17, 2015 at 02:19:58PM +0000, Liang, Kan wrote:
> 
> > > But that brings us to patch 1 of this series, how is that correct in the face of
> > > this? There is an arbitrary delay (A->B) added to the period.
> > > And the Changelog of course never did bother to make that clear.

That's how perf and other profilers always behaved. The PMI
is not part of the period. The automatic PEBS reload is not in any way
different. It's much faster than a PMI, but it's also not zero cost.

This is not a gap in measurement though -- there is no other code
running during that time on that CPU. It's simply overhead from the
measurement mechanism.

> > 
> > OK. I will update the changelog for patch 1 as below.
> > ---
> > When a fixed period is specified, this patch make perf use the PEBS
> > auto reload mechanism. This makes normal profiling faster, because
> > it avoids one costly MSR write in the PMI handler.
> 
> > However, the reset value will be loaded by hardware assist. There is 
> > a little bit delay compared to previous non-auto-reload mechanism.
> > The delay is arbitrary but very small.
> 
> What is very small? And doesn't that mean its significant at exactly the
> point this patch series is aimed at, namely very short period.

The assist cost is 400-800 cycles, assuming common cases with everything
cached. The minimum period the patch currently uses is 10000. In that
extreme case it can be ~10% if cycles are used.

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/