lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <4549240.M84I420bcG@milian-kdab2>
Date:   Mon, 11 Sep 2017 17:14:57 +0200
From:   Milian Wolff <milian.wolff@...b.com>
To:     linux-perf-users@...r.kernel.org
Cc:     Arnaldo Carvalho <acme@...nel.org>, linux-kernel@...r.kernel.org
Subject: perf AppImage - perf stat fails to count child process

Hey all,

because many people only have a pretty outdated perf userspace toolsuite 
available, I would like to package a recent version of perf into an AppImage. 
My current .travis.yml can be found at [1], the result is available from [2], 
e.g. [3].

[1]: https://github.com/milianw/linux/blob/wip/perf-appimage/.travis.yml
[2]: https://travis-ci.org/milianw/linux/
[3]: https://transfer.sh/IX3zr/perf-git.5b330557-x86_64.AppImage

This was build on a Ubuntu 14.04 VM using Travis CI. All libs that got bundled 
are those available on that system, except for elfutils - for that I used a 
custom build of the newer 0.170 release.

So far so good, this actually seems to work for most purposes I've tried so 
far. What's missing is proper `man` integration, I'll try to look into this. 
But one issue that I've found which is puzzling me is the following:

perf stat does not work when filtering on an application. I.e. when I use the 
system wide mode, it works fine:

~~~~~
$ ./perf-git.3a73b7f9-x86_64.AppImage stat
^C
 Performance counter stats for 'system wide':

       4510.197963      cpu-clock (msec)          #    3.998 CPUs utilized          
               205      context-switches          #    0.045 K/sec                  
                 3      cpu-migrations            #    0.001 K/sec                  
                 0      page-faults               #    0.000 K/sec                  
        16,396,597      cycles                    #    0.004 GHz                    
        13,065,398      stalled-cycles-frontend   #   79.68% frontend cycles 
idle   
        11,340,112      stalled-cycles-backend    #   69.16% backend cycles 
idle    
         5,718,881      instructions              #    0.35  insn per cycle         
                                                  #    2.28  stalled cycles 
per insn
         1,197,336      branches                  #    0.265 M/sec                  
           101,136      branch-misses             #    8.45% of all branches        

       1.128056669 seconds time elapsed
~~~~~

But when I launch an application most counters are empty:

~~~~~
$ ./perf-git.3a73b7f9-x86_64.AppImage stat sleep 1

 Performance counter stats for 'sleep 1':

     <not counted>      task-clock                                                  
     <not counted>      context-switches                                            
     <not counted>      cpu-migrations                                              
     <not counted>      page-faults                                                 
     <not counted>      cycles                                                      
     <not counted>      stalled-cycles-frontend                                     
     <not counted>      stalled-cycles-backend                                      
            68,696      instructions                                                                                   
            13,573      branches                  #    0.000 K/sec                  
             1,410      branch-misses             #   10.39% of all branches        

       0.000023100 seconds time elapsed
~~~~~

Also note how this did _not_ sleep for one second but rather quit sooner than 
that?!

Stracing (as root, due to AppImage), shows the following issue:

~~~~~
$ sudo strace -f -e fork,execve -s 1000 ./perf-git.3a73b7f9-x86_64.AppImage 
stat sleep 1
execve("./perf-git.3a73b7f9-x86_64.AppImage", ["./perf-git.3a73b7f9-
x86_64.AppImage", "stat", "sleep", "1"], 0x7ffea0ec02d0 /* 18 vars */) = 0
strace: Process 21593 attached
strace: Process 21595 attached
[pid 21593] +++ exited with 0 +++
[pid 21592] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21593, 
si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
strace: Process 21596 attached
[pid 21592] execve("/tmp/.mount_perf-ggA4YJW/AppRun", ["./perf-git.3a73b7f9-
x86_64.AppImage", "stat", "sleep", "1"], 0x1ce1b70 /* 21 vars */) = 0
strace: Process 21597 attached
[pid 21597] execve("/usr/libexec/perf-core/sleep", ["sleep", "1"], 0x3075650 /
* 22 vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/tmp/./sleep", ["sleep", "1"], 0x3075650 /* 22 vars */) = 
-1 ENOENT (No such file or directory)
[pid 21597] execve("/home/milian/.bin/sleep", ["sleep", "1"], 0x3075650 /* 22 
vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/usr/lib/icecream/libexec/icecc/bin/sleep", ["sleep", 
"1"], 0x3075650 /* 22 vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/ssd2/milian/projects/compiled/other/bin/sleep", ["sleep", 
"1"], 0x3075650 /* 22 vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/home/milian/.bin/kf5/sleep", ["sleep", "1"], 0x3075650 /* 
22 vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/ssd2/milian/projects/compiled/kf5/bin/sleep", ["sleep", 
"1"], 0x3075650 /* 22 vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/home/milian/projects/compiled/other/bin/sleep", ["sleep", 
"1"], 0x3075650 /* 22 vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/home/milian/projects/compiled/kf5/bin/sleep", ["sleep", 
"1"], 0x3075650 /* 22 vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/usr/local/sbin/sleep", ["sleep", "1"], 0x3075650 /* 22 
vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/usr/local/bin/sleep", ["sleep", "1"], 0x3075650 /* 22 
vars */) = -1 ENOENT (No such file or directory)
[pid 21597] execve("/usr/bin/sleep", ["sleep", "1"], 0x3075650 /* 22 vars */
 Performance counter stats for 'sleep 1':

     <not counted>      task-clock                                                  
     <not counted>      context-switches                                            
     <not counted> ) = 0
     cpu-migrations                                              
     <not counted>      page-faults                                                 
     <not counted>      cycles                                                      
     <not counted>      instructions                                                
     <not counted>      branches                                                    
     <not counted>      branch-misses                                               

       0.000033074 seconds time elapsed

Some events weren't counted. Try disabling the NMI watchdog:
        echo 0 > /proc/sys/kernel/nmi_watchdog
        perf stat ...
        echo 1 > /proc/sys/kernel/nmi_watchdog
[pid 21597] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=21592, 
si_uid=0} ---
[pid 21597] +++ killed by SIGTERM +++
[pid 21592] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=21597, 
si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} ---
[pid 21596] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=21595, 
si_uid=0} ---
[pid 21592] +++ exited with 0 +++
[pid 21595] --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=21595, 
si_uid=0} ---
[pid 21596] +++ exited with 0 +++
+++ exited with 0 +++
~~~~~

I couldn't yet find a way to get a backtrace to the SIGTERM. I also wonder why 
I don't see any error message anywhere...

To make this even more confusing, extracting to AppImage (it's essentially a 
squashfs image that gets loop-mounted via fuse), things start to work 
again...:

~~~~~
$ ./perf-git.3a73b7f9-x86_64.AppImage --appimage-extract
..
$ ./squashfs-root/AppRun stat sleep 1

 Performance counter stats for 'sleep 1':

          0.559792      task-clock (msec)         #    0.001 CPUs utilized          
                 1      context-switches          #    0.002 M/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
                60      page-faults               #    0.107 M/sec                  
           922,471      cycles                    #    1.648 GHz                    
           564,261      stalled-cycles-frontend   #   61.17% frontend cycles 
idle   
           441,173      stalled-cycles-backend    #   47.83% backend cycles 
idle    
           743,424      instructions              #    0.81  insn per cycle         
                                                  #    0.76  stalled cycles 
per insn
           145,166      branches                  #  259.321 M/sec                  
             7,726      branch-misses             #    5.32% of all branches        

       1.001073810 seconds time elapsed
~~~~~~

So clearly, it has something to do with the FUSE mounting / AppImage... But 
has anyone a clue what could be going on here exactly? Any suggestions for 
what to try in order to debug this?

Thanks
-- 
Milian Wolff | milian.wolff@...b.com | Senior Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
Download attachment "smime.p7s" of type "application/pkcs7-signature" (3826 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ