linux-kernel - [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <564C26C4.2040603@huawei.com>
Date:	Wed, 18 Nov 2015 15:20:36 +0800
From:	"Wangnan (F)" <wangnan0@...wei.com>
To:	Jiri Olsa <jolsa@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	David Ahern <dsahern@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Milian Wolff <milian.wolff@...b.com>
CC:	<linux-kernel@...r.kernel.org>, pi3orama <pi3orama@....com>,
	"lizefan 00213767" <lizefan@...wei.com>
Subject: [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling
 taken at 'callq' instruction

Hi all,

When analysising Jiri's patchset [1] I found a dwarf unwind problem.
On x86 platform, when sample is at a 'callq' instruction, dwarf based
stack unwind always fail.

I compile a small C source file with debug information, turn off
frame pointer and disable optimization:

$ gcc -g -O0 -fomit-frame-pointer ./test_dwarf_unwind.c -o 
./test_dwarf_unwind

Here is part of asm code. Please have a look at 40057d. Also, have a
look at 400534 and 4005bc, from these two instructions we know this
"funcc" need 0x38 bytes on stack, so we know where is the return address.

$ objdump -S ./test_dwarf_unwind
...
0000000000400534 <funcc>:
#include <sys/time.h>

static volatile int x = 0;

int funcc(void)
{
   400534:       48 83 ec 38             sub    $0x38,%rsp
     struct timeval tv1, tv2;
     unsigned long us1, us2;

     gettimeofday(&tv1, NULL);
   400538:       48 8d 44 24 10          lea    0x10(%rsp),%rax
   ...
   400572:       48 89 e0                mov    %rsp,%rax
   400575:       be 00 00 00 00          mov    $0x0,%esi
   40057a:       48 89 c7                mov    %rax,%rdi
   40057d:       e8 7e fe ff ff          callq  400400 
<gettimeofday@plt>    <---!!!
         us2 = tv2.tv_sec * 1000000 + tv2.tv_usec;
   400582:       48 8b 04 24             mov    (%rsp),%rax
   400586:       48 69 d0 40 42 0f 00    imul   $0xf4240,%rax,%rdx
   ...
     return x;
   4005b6:       8b 05 74 04 20 00       mov 0x200474(%rip),%eax        
# 600a30 <x>
}
   4005bc:       48 83 c4 38             add    $0x38,%rsp
   4005c0:       c3                      retq

Then use perf to trace it:

# ./pref record  -g -F 9 --call-graph dwarf ./test_dwarf_unwind
# ./perf report  --no-children --stdio
      ...
      1.23%  test_dwarf_unwi  test_dwarf_unwind  [.] funcc
             |
             ---funcc
                |
                |--0.62%-- funcb
                |          funca
                |          main
                |          __libc_start_main
                |          _start
                |
                 --0.62%-- 0   <-- I applied Jiri's patch in [1] so see 
this 0,
                                   which indicates a broken call chain.

 From 'perf script' result I found the two samples inside funcc:

# ./perf script
...
test_dwarf_unwi 25232 2201855.424651:  443365670 cycles:pp:
                   4005a4 funcc (/home/w00229757/test_dwarf_unwind)
                   4005ca funcb (/home/w00229757/test_dwarf_unwind)
                   4005d8 funca (/home/w00229757/test_dwarf_unwind)
                   4005e6 main (/home/w00229757/test_dwarf_unwind)
             7facc5c2ebd5 __libc_start_main 
(/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
                   400459 _start (/home/w00229757/test_dwarf_unwind)
...
test_dwarf_unwi 25232 2201856.647088:  443268987 cycles:pp:
                   40057d funcc (/home/w00229757/test_dwarf_unwind)
                        0 [unknown] ([unknown])

With perf report -D I can manually check their stack information manually.


This is stack of sample at 4005a4:

# ./perf report -D
...
0x1d9c8 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
.  0000:  09 00 00 00 02 40 e8 20 a4 05 40 00 00 00 00 00  .....@. ..@.....
...
.  00d0:  00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00  ......... ......
.  00e0:  2b f6 4c 56 00 00 00 00 46 52 02 00 00 00 00 00 +.LV....FR......
.  00f0:  29 f6 4c 56 00 00 00 00 70 e4 0c 00 00 00 00 00 ).LV....p.......
.  0100:  06 f3 3b d7 d7 24 05 00 b0 00 28 d7 d7 24 05 00 ..;..$....(..$..
.  0110:  00 00 00 00 00 00 00 00 ca 05 40 00 00 00 00 00
...
.  20e0:  b0 06 00 00 00 00 00 00                          ........
.
2201855424651081 0x1d9c8 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002): 
25232/25232: 0x4005a4 period: 443365670 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.......
.... IP    0x4005a4
.......
... ustack: size 1712, offset 0xd8
....

 From the above information we can check its stack from 0xe0 (stack size
reside at 0xd8 as a u64 so stack data start at 0xe0). Based on information
we get from the above asm code, we know the return address should at the
0x38 of the stack, which is 0x4005ca.

This is stack of sample at 40057d:

0x343c0 [0x20e8]: event: 9
.
. ... raw event: size 8424 bytes
.  0000:  09 00 00 00 02 40 e8 20 7d 05 40 00 00 00 00 00  .....@. }.@.....
...
.  00d0:  00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00  ......... ......
.  00e0:  82 05 40 00 00 00 00 00 2c f6 4c 56 00 00 00 00 ..@.....,.LV....
.  00f0:  84 b7 05 00 00 00 00 00 29 f6 4c 56 00 00 00 00 ........).LV....
.  0100:  70 e4 0c 00 00 00 00 00 84 9a 4e d7 d7 24 05 00 p.........N..$..
.  0110:  b0 00 28 d7 d7 24 05 00 00 00 00 00 00 00 00 00 ..(..$..........
.  0120:  ca 05 40 00 00 00 00 00 f0 05 40 00 00 00 00 00 ..@.......@.....
...
.  20e0:  b8 06 00 00 00 00 00 00                          ........
.
2201856647088808 0x343c0 [0x20e8]: PERF_RECORD_SAMPLE(IP, 0x4002): 
25232/25232: 0x40057d period: 443268987 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
......
.... IP    0x40057d
......
... ustack: size 1720, offset 0xd8
.....

By manually checking user stack from 0xe0, we can see problem that: 
although the IP
is still at 'callq', that instruction seems already take effect that, it has
already pushes its return address (0x400582) onto stack.

So there must be somewhere the pt_regs or the stack is adjusted in kernel.
Does anyone can give some hint on it?

Thank you.

[1] 
http://lkml.kernel.org/r/1447772739-18471-1-git-send-email-jolsa@kernel.org

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/