lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 15 Jul 2014 12:41:39 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Yan, Zheng" <zheng.z.yan@...el.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org, acme@...radead.org,
	eranian@...gle.com, andi@...stfloor.org
Subject: Re: [PATCH v2 4/7] perf, x86: large PEBS interrupt threshold

On Tue, Jul 15, 2014 at 04:58:56PM +0800, Yan, Zheng wrote:
> PEBS always had the capability to log samples to its buffers without
> an interrupt. Traditionally perf has not used this but always set the
> PEBS threshold to one.
> 
> For frequently occuring events (like cycles or branches or load/stores)
> this in term requires using a relatively high sampling period to avoid
> overloading the system, by only processing PMIs. This in term increases
> sampling error.
> 
> For the common cases we still need to use the PMI because the PEBS
> hardware has various limitations. The biggest one is that it can not
> supply a callgraph. It also requires setting a fixed period, as the
> hardware does not support adaptive period. Another issue is that it
> cannot supply a time stamp and some other options. To supply a TID it
> requires flushing on context switch. It can however supply the IP, the
> load/store address, TSX information, registers, and some other things.
> 
> So we can make PEBS work for some specific cases, basically as long as
> you can do without a callgraph and can set the period you can use this
> new PEBS mode.
> 
> The main benefit is the ability to support much lower sampling period
> (down to -c 1000) without extensive overhead.
> 
> One use cases is for example to increase the resolution of the c2c tool.
> Another is double checking when you suspect the standard sampling has
> too much sampling error.
> 
> Some numbers on the overhead, using cycle soak, comparing
> "perf record --no-time -e cycles:p -c" to "perf record -e cycles:p -c"
> 
> period    plain  multi  delta
> 10003     15     5      10
> 20003     15.7   4      11.7
> 40003     8.7    2.5    6.2
> 80003     4.1    1.4    2.7
> 100003    3.6    1.2    2.4
> 800003    4.4    1.4    3
> 1000003   0.6    0.4    0.2
> 2000003   0.4    0.3    0.1
> 4000003   0.3    0.2    0.1
> 10000003  0.3    0.2    0.1
> 
> The interesting part is the delta between multi-pebs and normal pebs. Above
> -c 1000003 it does not really matter because the basic overhead is so low.
> With periods below 80003 it becomes interesting.
> 
> Note in some other workloads (e.g. kernbench) the smaller sampling periods
> cause much more overhead without multi-pebs,  upto 80% (and throttling) have
> been observed with -c 10003. multi pebs generally does not throttle.
> 

And not a single word on the multiplex horror we talked about. That
should be mentioned, in detail.

Content of type "application/pgp-signature" skipped

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ