Message-ID: <20150417182037.GZ2366@two.firstfloor.org>
Date:	Fri, 17 Apr 2015 20:20:37 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"Liang, Kan" <kan.liang@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"mingo@...nel.org" <mingo@...nel.org>,
	"acme@...radead.org" <acme@...radead.org>,
	"eranian@...gle.com" <eranian@...gle.com>,
	"andi@...stfloor.org" <andi@...stfloor.org>
Subject: Re: [PATCH V6 4/6] perf, x86: handle multiple records in PEBS buffer

On Fri, Apr 17, 2015 at 04:44:07PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 17, 2015 at 02:19:58PM +0000, Liang, Kan wrote:
> 
> > > But that brings us to patch 1 of this series, how is that correct in the face of
> > > this? There is an arbitrary delay (A->B) added to the period.
> > > And the Changelog of course never did bother to make that clear.

That's how perf and other profilers have always behaved. The PMI
is not part of the period. The automatic PEBS reload is not in any way
different. It's much faster than a PMI, but it's also not zero cost.

This is not a gap in measurement though -- there is no other code
running during that time on that CPU. It's simply overhead from the
measurement mechanism.

> > 
> > OK. I will update the changelog for patch 1 as below.
> > ---
> > When a fixed period is specified, this patch makes perf use the PEBS
> > auto reload mechanism. This makes normal profiling faster, because
> > it avoids one costly MSR write in the PMI handler.
> 
> > However, the reset value will be loaded by a hardware assist. There is
> > a small delay compared to the previous non-auto-reload mechanism.
> > The delay is arbitrary but very small.
> 
> What is very small? And doesn't that mean it's significant at exactly the
> point this patch series is aimed at, namely very short periods?

The assist cost is 400-800 cycles, assuming the common case with everything
cached. The minimum period the patch currently uses is 10000. In that
extreme case the overhead can approach ~10% of the period if the event
being counted is cycles.
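
To make the arithmetic explicit, here is a minimal sketch of the ratio
behind that estimate. The function and constant names are hypothetical
and only for illustration; the inputs are the 400-800 cycle assist cost
and the minimum period of 10000 quoted above, and the model assumes the
event being counted is cycles, so that assist cost and period are in the
same unit.

```python
def reload_overhead_fraction(assist_cycles: int, period: int) -> float:
    """Return the assist cost as a fraction of the sampling period.

    Assumes the sampled event is cycles, so both values share a unit.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return assist_cycles / period


MIN_PERIOD = 10000           # minimum period quoted in this message
ASSIST_CYCLES = (400, 800)   # quoted assist cost range, everything cached

for cycles in ASSIST_CYCLES:
    frac = reload_overhead_fraction(cycles, MIN_PERIOD)
    print(f"{cycles} cycles / {MIN_PERIOD} = {frac:.1%}")
```

With these inputs the ratio comes out at 4% to 8%, so ~10% is a rounded
upper bound for the worst case. For longer periods the fraction shrinks
proportionally.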

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.