Message-ID: <20120320070755.GB27423@gmail.com>
Date:	Tue, 20 Mar 2012 08:07:55 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] perf events changes for v3.4


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Mon, Mar 19, 2012 at 8:53 AM, Ingo Molnar <mingo@...nel.org> wrote:
> >
> > Please pull the latest perf-core-for-linus git tree from:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf-core-for-linus
> 
> This seems to be another pull request that really could have 
> done with some high-level overview of what the changes are for 
> 3.4

Yeah, you are right, will do that next time around for the 
larger trees (or trees that are not obviously single-topic at 
first sight).

Here is a short (and incomplete) high-level summary of the perf 
events changes of the v3.4 cycle:

 - New "hardware based branch profiling" feature both on the 
   kernel and the tooling side, on CPUs that support it (currently 
   modern x86 Intel CPUs with the 'LBR' hardware feature).

   This new feature is basically a sophisticated 'magnifying 
   glass' for branch execution - something that is pretty 
   difficult to extract from regular, function histogram centric 
   profiles.

   The simplest mode is activated via 'perf record -b', and the 
   result looks like this in perf report:

    $ perf record -b any_call,u -e cycles:u branchy

    $ perf report -b --sort=symbol
        52.34%  [.] main                   [.] f1
        24.04%  [.] f1                     [.] f3
        23.60%  [.] f1                     [.] f2
         0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
         0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
         0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
         0.01%  [k] __printf               [k] _IO_vfprintf_internal
         0.01%  [k] main                   [k] __printf

   This output has from/to branch columns and lists the 
   highest-percentage (from,to) jump combinations - i.e. the 
   most frequently taken branches in the system. "Branches" can 
   also include function calls and any other synchronous and 
   asynchronous transitions of the instruction pointer that are 
   not 'next instruction' - such as system calls, traps, 
   interrupts, etc.

   This feature comes with (hopefully intuitive) flat ascii and 
   TUI support in perf report.

 - Various 'perf annotate' visual improvements for us assembly
   junkies. It will now recognize function calls in the TUI and
   by hitting enter you can follow the call (recursively) and 
   back, amongst other improvements.

 - Multiple threads/processes recording support in perf record,
   perf stat, perf top - which is activated via a comma-list of 
   PIDs:

     perf top -p 21483,21485
     perf stat -p 21483,21485 -ddd
     perf record -p 21483,21485

 - Support for per UID views, via the --uid parameter to perf 
   top, perf report, etc. For example 'perf top --uid mingo' 
   will only show the tasks that I am running, excluding other 
   users, root, etc.

 - Jump label restructurings and improvements - this
   includes the factoring out of the (hopefully much clearer)
   include/linux/static_key.h generic facility:

	struct static_key key = STATIC_KEY_INIT_FALSE;

	...

	if (static_key_false(&key))
		do unlikely code
	else
		do likely code

	...
	static_key_slow_inc(&key);
	...
	static_key_slow_dec(&key);
	...

   The static_key_false() branch will be generated into the code 
   with as little impact to the likely code path as possible. 
   The static_key_slow_*() APIs flip the branch via live kernel 
   code patching.

   This facility can now be used more widely within the
   kernel to micro-optimize hot branches whose likelihood 
   matches the static-key usage and fast/slow cost patterns.

 - SW function tracer improvements: perf support and filtering
   support.

 - Various hardenings of the perf.data ABI, to make older 
   perf.data files work more smoothly with newer tool versions, 
   to make new features integrate more smoothly, to support 
   cross-endian recording/analyzing workflows better, etc.

 - Restructuring of the kprobes code, the splitting out of 
   'optprobes', and a corner case bugfix.

 - Allow the tracing of kernel console output (printk).

 - Improvements/fixes to user-space RDPMC support, allowing 
   user-space self-profiling code to extract PMU counts without
   performing any system calls, while playing nice with the 
   kernel side.

 - 'perf bench' improvements

 - ... and lots of internal restructurings, cleanups and fixes 
   that made these features possible. And, as usual, this list is 
   incomplete, as there were also lots of other improvements:

     165 files changed, 6107 insertions(+), 1984 deletions(-)

Thanks,

	Ingo