Message-ID: <20090622121616.GB6760@one.firstfloor.org>
Date:	Mon, 22 Jun 2009 14:16:16 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	eranian@...il.com, LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Robert Richter <robert.richter@....com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Paul Mackerras <paulus@...ba.org>,
	Andi Kleen <andi@...stfloor.org>,
	Maynard Johnson <mpjohn@...ibm.com>,
	Carl Love <cel@...ibm.com>,
	Corey J Ashford <cjashfor@...ibm.com>,
	Philip Mucci <mucci@...s.utk.edu>,
	Dan Terpstra <terpstra@...s.utk.edu>,
	perfmon2-devel <perfmon2-devel@...ts.sourceforge.net>
Subject: Re: IV.4 - Intel PEBS

On Mon, Jun 22, 2009 at 02:00:52PM +0200, Ingo Molnar wrote:
> Having said that, PEBS is a hardware sampling feature that is
> definitely saner than AMD's IBS. There's two immediate incremental
> uses of it in perfcounters:
> 
>  - it makes flat sampling lower overhead by avoiding an NMI for all
>    sample points.
> 
>  - it makes flat sampled data more precise. (I.e. it can avoid the
>    1-2 instructions 'skidding' of a sample position, for a handful

There are realistic examples where the non-PEBS "shadow" can be far larger
than that, even causing systematic error and hiding complete basic blocks.

>    of PEBS-capable events.)

There are more reasons. Especially on Nehalem, there are some useful things
you can only measure with PEBS, e.g. memory latency or address
histograms (although the latter is quite complicated). Nehalem also
has a lot more PEBS-capable events than older parts.

Long term the trend is likely that more and more advanced PMU features
will require PEBS.

> Regarding demultiplexing on Nehalem: PEBS goes into the DS (Data
> Store), and indeed on Nehalem all PEBS counters 'mix' their PEBS
> records in the same stream of data. One possible model to support
> them is to set the PEBS threshold to one, and hence generate an
> interrupt for each PEBS record. At offset 0x90 of the PEBS record we

Then you would need the NMIs again; the NMI avoidance in PEBS only
works with higher thresholds.
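
A rough sketch of the arithmetic behind that point, in Python. It assumes
the simplest possible model: one interrupt is raised each time `threshold`
records have accumulated, and the buffer is drained at that point. The
function name and the sample counts are made up for illustration; they do
not come from perfcounters or from any hardware documentation.

```python
def pmi_count(samples: int, threshold: int) -> int:
    """Interrupts raised while the CPU writes `samples` PEBS records.

    Assumed model: one interrupt per `threshold` accumulated records,
    with the buffer drained on every interrupt.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    return samples // threshold


if __name__ == "__main__":
    # threshold=1 degenerates to one interrupt per sample.
    for threshold in (1, 16, 64):
        print(threshold, pmi_count(10_000, threshold))
```

With a threshold of one, every sample costs an interrupt, so the saving
only appears once the threshold is raised.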

> As to enabling PEBS with the (CPU-)global latency recording filters,
> we can do this transparently for every PEBS supported event, or can
> mandate PEBS scheduling when a PEBS only feature like load latency
> is requested.
> 
> This means that for most purposes PEBS will be transparent.

One disadvantage here is that you pay a lot of extra measurement
overhead: an interrupt is always much more costly than a PEBS record
written directly by the CPU.

But when you support batching multiple PEBS events, I suspect the consumer
needs to be aware of the limitations, e.g. no precise time stamps.
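
To illustrate the time stamp limitation, here is a small Python sketch.
It assumes the PEBS record carries no time stamp of its own, so the only
time available is the moment the buffer is drained. The `Sample` type, the
`drain` helper, and the addresses are hypothetical and only serve the
example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    ip: int          # instruction pointer taken from the PEBS record
    drained_at: int  # time the buffer was drained, not time of the event


def drain(buffered_ips, now):
    """Turn a batch of buffered records into samples.

    Every record in the batch gets the same drain time, because
    (by assumption) the records themselves hold no time stamp.
    """
    return [Sample(ip=ip, drained_at=now) for ip in buffered_ips]


batch = drain([0x401000, 0x401020, 0x401044], now=5_000)
```

All three samples end up with the same `drained_at` value, so a consumer
can order batches against each other but cannot place an individual
sample in time within its batch.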

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.