linux-kernel - [F.A.Q.] perf ABI backwards and forwards compatibility

Message-ID: <20111108102235.GA1241@elte.hu>
Date:	Tue, 8 Nov 2011 11:22:35 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Ted Ts'o <tytso@....edu>, Pekka Enberg <penberg@...helsinki.fi>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	Vince Weaver <vince@...ter.net>,
	Pekka Enberg <penberg@...nel.org>,
	Anthony Liguori <anthony@...emonkey.ws>,
	Avi Kivity <avi@...hat.com>,
	"kvm@...r.kernel.org list" <kvm@...r.kernel.org>,
	"linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>,
	qemu-devel Developers <qemu-devel@...gnu.org>,
	Alexander Graf <agraf@...e.de>,
	Blue Swirl <blauwirbel@...il.com>,
	Américo Wang <xiyou.wangcong@...il.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: [F.A.Q.] perf ABI backwards and forwards compatibility


* Ted Ts'o <tytso@....edu> wrote:

> I don't believe there's ever been any guarantee that "perf test" 
> from version N of the kernel will always work on a version N+M of 
> the kernel.  Perhaps I am wrong, though. If that is a guarantee 
> that the perf developers are willing to stand behind, or have 
> already made, I would love to be corrected and would be delighted 
> to hear that in fact there is a stable, backwards compatible perf 
> ABI.

We do even more than that: the perf ABI is fully backwards *and* 
forwards compatible. You can run older perf on newer ABIs and newer 
perf on older ABIs.

To show you how it works in practice, here's a random 
cross-compatibility experiment: going back to the perf ABI of 2 years 
ago. I used v2.6.32 which was just the second upstream kernel with 
perf released in it.

So I took a fresh perf tool version and booted a vanilla v2.6.32 
(x86, defconfig, PERF_COUNTERS=y) kernel:

  $ uname -a
  Linux mercury 2.6.32 #162137 SMP Tue Nov 8 10:55:37 CET 2011 x86_64 x86_64 x86_64 GNU/Linux

  $ perf --version
  perf version 3.1.1927.gceec2

  $ perf top

  Events: 2K cycles
 61.68%  [kernel]             [k] sha_transform
 16.09%  [kernel]             [k] mix_pool_bytes_extract
  4.70%  [kernel]             [k] extract_buf
  4.17%  [kernel]             [k] _spin_lock_irqsave
  1.44%  [kernel]             [k] copy_user_generic_string
  0.75%  [kernel]             [k] extract_entropy_user
  0.37%  [kernel]             [k] acpi_pm_read

[the box is running a /dev/urandom stress-test as you can see.]

 $ perf stat sleep 1

 Performance counter stats for 'sleep 1':

          0.766698 task-clock                #    0.001 CPUs utilized          
                 1 context-switches          #    0.001 M/sec                  
                 0 CPU-migrations            #    0.000 M/sec                  
               177 page-faults               #    0.231 M/sec                  
         1,513,332 cycles                    #    1.974 GHz                    
   <not supported> stalled-cycles-frontend 
   <not supported> stalled-cycles-backend  
           522,609 instructions              #    0.35  insns per cycle        
            65,812 branches                  #   85.838 M/sec                  
             7,762 branch-misses             #   11.79% of all branches        

       1.076211168 seconds time elapsed

The two <not supported> events are not implemented by the old kernel, 
but the other events are, and the tool picked them up without bailing 
out.
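
For scripts that consume this output, the same graceful degradation can be sketched as follows. This is a minimal illustration, assuming the human-readable `perf stat` layout shown above (which is not itself a stable interface); `parse_perf_stat` and `LINE_RE` are hypothetical names, not part of perf.

```python
import re

# Sample copied from the `perf stat sleep 1` output above.
SAMPLE = """\
 Performance counter stats for 'sleep 1':

          0.766698 task-clock                #    0.001 CPUs utilized
                 1 context-switches          #    0.001 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               177 page-faults               #    0.231 M/sec
         1,513,332 cycles                    #    1.974 GHz
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
           522,609 instructions              #    0.35  insns per cycle
            65,812 branches                  #   85.838 M/sec
             7,762 branch-misses             #   11.79% of all branches

       1.076211168 seconds time elapsed
"""

# One counter line: a value (or the "<not supported>" marker), then the event name.
LINE_RE = re.compile(r"^\s*(<not supported>|[\d,.]+)\s+([\w-]+)")


def parse_perf_stat(text):
    """Return (counts, unsupported) from human-readable perf stat output."""
    counts, unsupported = {}, []
    for line in text.splitlines():
        if "time elapsed" in line:
            continue  # wall-clock summary line, not a counter
        m = LINE_RE.match(line)
        if not m:
            continue  # header or blank line
        value, event = m.groups()
        if value == "<not supported>":
            # Keep going: one missing event must not discard the others.
            unsupported.append(event)
        else:
            counts[event] = float(value.replace(",", ""))
    return counts, unsupported


counts, unsupported = parse_perf_stat(SAMPLE)
print("unsupported:", unsupported)
print("cycles:", counts["cycles"])
```

The point of the sketch is only that an unsupported event is recorded and skipped, while every event the kernel does support is still collected.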

Regular profiling:

 $ perf record -a sleep 1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.075 MB perf.data (~3279 samples) ]

perf report output:

 $ perf report

 Events: 1K cycles
  64.45%          dd  [kernel.kallsyms]    [k] sha_transform
  19.39%          dd  [kernel.kallsyms]    [k] mix_pool_bytes_extract
   4.11%          dd  [kernel.kallsyms]    [k] _spin_lock_irqsave
   2.98%          dd  [kernel.kallsyms]    [k] extract_buf
   0.84%          dd  [kernel.kallsyms]    [k] copy_user_generic_string
   0.38%         ssh  libcrypto.so.0.9.8b  [.] lh_insert
   0.28%   flush-8:0  [kernel.kallsyms]    [k] block_write_full_page_endio
   0.28%   flush-8:0  [kernel.kallsyms]    [k] generic_make_request

These examples show *PICTURE PERFECT* backwards ABI compatibility, 
when using the bleeding edge perf tool on an ancient perf kernel (from 
when it wasn't even called 'perf events' but 'perf counters').

[ Note, I didn't go back to v2.6.31, the oldest upstream perf kernel, 
  because it's such a pain to build with recent binutils and recent 
  GCC ... v2.6.32 already needed a workaround and a couple of .config 
  tweaks to build and boot at all. ]

Then I built the ancient v2.6.32 perf tool from 2 years ago:

 $ perf --version
 perf version 0.0.2.PERF

and booted a fresh v3.1+ kernel:

 $ uname -a
 Linux mercury 3.1.0-tip+ #162138 SMP Tue Nov 8 11:14:26 CET 2011 x86_64 x86_64 x86_64 GNU/Linux

 $ perf stat ls

 Performance counter stats for 'ls':

       1.739193  task-clock-msecs         #      0.069 CPUs 
              0  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
            250  page-faults              #      0.144 M/sec
        3477562  cycles                   #   1999.526 M/sec
        1661460  instructions             #      0.478 IPC  
         839826  cache-references         #    482.883 M/sec
          15742  cache-misses             #      9.051 M/sec

    0.025231139  seconds time elapsed

 $ perf top

 ------------------------------------------------------------------------------
   PerfTop:   38916 irqs/sec  kernel:99.6% [100000 cycles],  (all, 2 CPUs)
 ------------------------------------------------------------------------------

             samples    pcnt   kernel function
             _______   _____   _______________

            41191.00 - 53.1% : sha_transform
            20818.00 - 26.8% : mix_pool_bytes_extract
             5481.00 -  7.1% : _raw_spin_lock_irqsave
             2132.00 -  2.7% : extract_buf
             1788.00 -  2.3% : copy_user_generic_string
              801.00 -  1.0% : acpi_pm_read
              446.00 -  0.6% : _raw_spin_unlock_irqrestore
              284.00 -  0.4% : __memset
              259.00 -  0.3% : extract_entropy_user

 $ perf record -a -f sleep 1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.034 MB perf.data (~1467 samples) ]

 $ perf report

 # Samples: 1023
 #
 # Overhead        Command                     Shared Object  Symbol
 # ........  .............  ................................  ......
 #
     4.50%        swapper  [kernel]                          [k] acpi_pm_read
     4.01%        swapper  [kernel]                          [k] delay_tsc
     2.05%           sudo  /lib64/libcrypto.so.0.9.8b        [.] 0x000000000a0549
     1.96%           perf  [kernel]                          [k] vsnprintf
     1.86%        swapper  [kernel]                          [k] test_clear_page_writeback
     1.66%           perf  [kernel]                          [k] format_decode
     1.56%           sudo  /lib64/ld-2.7.so                  [.] do_lookup_x

These examples show *PICTURE PERFECT* forwards ABI compatibility, 
using the ancient perf tool on a bleeding edge kernel.
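
As a cross-check, the derived columns printed by both tool generations can be recomputed from the raw counts in the two `perf stat` runs above. This is a small sketch; the helper names `ipc` and `clock_mhz` are made up for illustration, and the formulas are the obvious ratios rather than anything taken from the perf source.

```python
def ipc(instructions, cycles):
    """Instructions per cycle."""
    return instructions / cycles


def clock_mhz(cycles, task_clock_msec):
    """Cycles per millisecond, divided by 1e3, gives millions of cycles per second."""
    return cycles / task_clock_msec / 1e3


# Raw counts copied from the two `perf stat` outputs above.
new_tool_old_kernel = {"task_clock": 0.766698, "cycles": 1513332, "instructions": 522609}
old_tool_new_kernel = {"task_clock": 1.739193, "cycles": 3477562, "instructions": 1661460}

for name, run in [("new tool, v2.6.32 kernel", new_tool_old_kernel),
                  ("v2.6.32 tool, v3.1+ kernel", old_tool_new_kernel)]:
    print(name,
          "IPC=%.3f" % ipc(run["instructions"], run["cycles"]),
          "clock=%.3f MHz" % clock_mhz(run["cycles"], run["task_clock"]))
```

The recomputed values agree with the printed "0.35 insns per cycle" / "1.974 GHz" and "0.478 IPC" / "1999.526 M/sec" columns to the precision shown, even though the two tool versions label and format them differently.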

Over the years we carried the subsystem through various 
transformations and added tons of features, while maintaining the 
perf ABI.
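
One ingredient that makes this kind of extension possible is that `struct perf_event_attr` carries a `size` field, so the kernel can tell which revision of the struct a tool was built against. The sketch below is a simplified Python model of that convention as I understand it, not the kernel's actual code: `copy_attr` is a hypothetical name, the buffer length stands in for the `size` field, and the sizes 64 and 72 are used purely for illustration.

```python
import errno


def copy_attr(user_bytes: bytes, kernel_size: int) -> bytes:
    """Model of a kernel copying a size-versioned struct from user space."""
    user_size = len(user_bytes)
    if user_size <= kernel_size:
        # Older tool on newer kernel: fields the tool does not know
        # about are zero-filled, which means "old behaviour".
        return user_bytes + bytes(kernel_size - user_size)
    known, extra = user_bytes[:kernel_size], user_bytes[kernel_size:]
    if any(extra):
        # Newer tool asked for a feature this kernel does not have.
        raise OSError(errno.E2BIG, "attr uses fields unknown to this kernel")
    # Newer tool on older kernel, but no new feature requested: accept.
    return known


OLD_SIZE, NEW_SIZE = 64, 72                  # illustrative struct sizes

old_attr = b"\x01" * OLD_SIZE                # built against the old layout
new_plain = old_attr + bytes(8)              # new layout, new fields left at zero
new_fancy = old_attr + b"\x02" + bytes(7)    # new layout, a new field is set

seen_by_new_kernel = copy_attr(old_attr, NEW_SIZE)
seen_by_old_kernel = copy_attr(new_plain, OLD_SIZE)

try:
    copy_attr(new_fancy, OLD_SIZE)
    rejected = False
except OSError as e:
    rejected = e.errno == errno.E2BIG

print(len(seen_by_new_kernel), len(seen_by_old_kernel), rejected)
```

In this model a tool only hits an error when it actually requests something the running kernel lacks, and it can then retry without that feature instead of failing outright.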

I don't know where the whole ABI argument comes from - perf has 
arguably one of the best and most compatible tooling ABIs within 
Linux. I suspect that back in the original perf flamewars people made 
up their minds prematurely that it 'cannot' possibly work and never 
changed them, regardless of reality proving them wrong ;-)

And yes, the quality of the ABI and tooling cross-compatibility is 
not accidental at all: it is fully intentional and we take great care 
that it stays so. More than that, we'll gladly take more 'perf test' 
testcases for obscure corner-cases that other tools might rely on. 
I.e. we are willing to help external tooling get their testcases 
built into the kernel repo.

Note that such a level of ABI support is arguably overkill for 
instrumentation, which by its very nature tends to migrate to the 
newer versions. Still, we maintain it because in our opinion good, 
usable tooling should have a good, extensible ABI.

Thanks,

	Ingo