lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20151005212045.GG26924@tassilo.jf.intel.com>
Date:	Mon, 5 Oct 2015 14:20:45 -0700
From:	Andi Kleen <ak@...ux.intel.com>
To:	Jesper Dangaard Brouer <brouer@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	Christoph Lameter <cl@...ux.com>,
	Alexander Duyck <alexander.duyck@...il.com>,
	Pekka Enberg <penberg@...nel.org>, netdev@...r.kernel.org,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	David Rientjes <rientjes@...gle.com>,
	Hannes Frederic Sowa <hannes@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB
 freelists

> My only problem left, is I want a perf measurement that pinpoint these
> kind of spots.  The difference in L1-icache-load-misses were significant
> (1,278,276 vs 2,719,158).  I tried to somehow perf record this with
> different perf events without being able to pinpoint the location (even
> though I know the spot now).  Even tried Andi's ocperf.py... maybe he
> will know what event I should try?

Run pmu-tools toplev.py -l3 with --show-sample. It tells you what the
bottle neck is and what to sample for if there is a suitable event and
even prints the command line.

https://github.com/andikleen/pmu-tools/wiki/toplev-manual#sampling-with-toplev

However frontend issues are difficult to sample, as they happen very far
away from instruction retirement where the sampling happens. So you may
have large skid and the sampling points may be far away. Skylake has new
special FRONTEND_* PEBS events for this, but before it was often difficult. 

BTW if your main goal is icache; I wrote a gcc patch to help the kernel
by enabling function splitting: Apply the patch in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66890 to gcc 5,
make sure 9bebe9e5b0f (now in mainline) is applied and build with
-freorder-blocks-and-partition. That will split all functions into
statically predicted hot and cold parts and generally relieves
icache pressure. Any testing of this on your workload welcome.

-Andi

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ