lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 21 Oct 2018 00:39:51 +0200
From:   Milian Wolff <mail@...ianw.de>
To:     linux-kernel@...r.kernel.org, Jiri Olsa <jolsa@...nel.org>,
        namhyung@...nel.org
Cc:     linux-perf-users@...r.kernel.org,
        Arnaldo Carvalho <acme@...nel.org>
Subject: Broken dwarf unwinding - wrong stack pointer register value?

Hey all,

I'm on the quest to figure out why perf regularly fails to unwind (some) 
samples. I am seeing very strange behavior, where an apparently wrong stack 
pointer value is read from the register - see below for more information and 
the end of this (long) mail for my open questions. Any help would be greatly 
appreciated.

I am currently using this trivial C++ code to reproduce the issue:

```
#include <cmath>
#include <complex>
#include <iostream>
#include <random>

using namespace std;

int main()
{
    uniform_real_distribution<double> uniform(-1E5, 1E5);
    default_random_engine engine;
    double s = 0;
    for (int i = 0; i < 10000000; ++i) {
        s += norm(complex<double>(uniform(engine), uniform(engine)));
    }
    cout << s << '\n';
    return 0;
}
```

I compile it with `g++ -O2 -g` and then record it with `perf record --call-
graph dwarf`. Using perf script, I then see e.g.:

```
$ perf script -v --no-inline --time 90229.12668,90229.127158 --ns
...
# first frame (working unwinding from __hypot_finite):
unwind: reg 16, val 7faf7dca2696
unwind: reg 7, val 7ffc80811ca0
unwind: find_proc_info dso /usr/lib/libm-2.28.so
unwind: access_mem addr 0x7ffc80811ce8 val 7faf7dc88af9, offset 72
unwind: find_proc_info dso /usr/lib/libm-2.28.so
unwind: access_mem addr 0x7ffc80811d08 val 56382b0fc129, offset 104
unwind: find_proc_info dso /home/milian/projects/kdab/rnd/hotspot/build/tests/
test-clients/cpp-inlining/cpp-inlining
unwind: access_mem addr 0x7ffc80811d58 val 7faf7dabf223, offset 184
unwind: find_proc_info dso /usr/lib/libc-2.28.so
unwind: access_mem addr 0x7ffc80811e18 val 56382b0fc1ee, offset 376
unwind: find_proc_info dso /home/milian/projects/kdab/rnd/hotspot/build/tests/
test-clients/cpp-inlining/cpp-inlining
unwind: __hypot_finite:ip = 0x7faf7dca2696 (0x29696)
unwind: hypotf32x:ip = 0x7faf7dc88af8 (0xfaf8)
unwind: main:ip = 0x56382b0fc128 (0x1128)
unwind: __libc_start_main:ip = 0x7faf7dabf222 (0x24222)
unwind: _start:ip = 0x56382b0fc1ed (0x11ed)
# second frame (unrelated)
unwind: reg 16, val 56382b0fc114
unwind: reg 7, val 7ffc80811d10
unwind: find_proc_info dso /home/milian/projects/kdab/rnd/hotspot/build/tests/
test-clients/cpp-inlining/cpp-inlining
unwind: access_mem addr 0x7ffc80811d58 val 7faf7dabf223, offset 72
unwind: access_mem addr 0x7ffc80811e18 val 56382b0fc1ee, offset 264
unwind: main:ip = 0x56382b0fc114 (0x1114)
unwind: __libc_start_main:ip = 0x7faf7dabf222 (0x24222)
unwind: _start:ip = 0x56382b0fc1ed (0x11ed)
# third frame (broken unwinding from __hypot_finite)
unwind: reg 16, val 7faf7dca2688
unwind: reg 7, val 7ffc80811ca0
unwind: find_proc_info dso /usr/lib/libm-2.28.so
unwind: access_mem addr 0x7ffc80811cc0 val 0, offset 32
unwind: __hypot_finite:ip = 0x7faf7dca2688 (0x29688)
unwind: '':ip = 0xffffffffffffffff (0x0)
cpp-inlining 24617 90229.126685606:     711026 cycles:uppp: 
	    7faf7dca2696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
	    7faf7dc88af8 hypotf32x+0x18 (/usr/lib/libm-2.28.so)
	    56382b0fc128 main+0x88 (/home/milian/projects/kdab/rnd/hotspot/
build/tests/test-clients/cpp-inlining/cpp-inlining)
	    7faf7dabf222 __libc_start_main+0xf2 (/usr/lib/libc-2.28.so)
	    56382b0fc1ed _start+0x2d (/home/milian/projects/kdab/rnd/hotspot/
build/tests/test-clients/cpp-inlining/cpp-inlining)

cpp-inlining 24617 90229.126921551:     714657 cycles:uppp: 
	    56382b0fc114 main+0x74 (/home/milian/projects/kdab/rnd/hotspot/
build/tests/test-clients/cpp-inlining/cpp-inlining)
	    7faf7dabf222 __libc_start_main+0xf2 (/usr/lib/libc-2.28.so)
	    56382b0fc1ed _start+0x2d (/home/milian/projects/kdab/rnd/hotspot/
build/tests/test-clients/cpp-inlining/cpp-inlining)

cpp-inlining 24617 90229.127157818:     719976 cycles:uppp: 
	    7faf7dca2688 __hypot_finite+0x28 (/usr/lib/libm-2.28.so)
	ffffffffffffffff [unknown] ([unknown])
...
```

Now I'm trying to figure out why one __hypot_finite sample works but the other 
one breaks for no apparent reason. One thing I noticed so far is the following 
discrepancy in memory accesses, i.e. for the working sample we get:

```
unwind: reg 16, val 7faf7dca2696
unwind: reg 7, val 7ffc80811ca0
unwind: access_mem addr 0x7ffc80811ce8 val 7faf7dc88af9, offset 72
...
```

But the broken sample only produces:

```
unwind: reg 16, val 7faf7dca2688
unwind: reg 7, val 7ffc80811ca0
unwind: find_proc_info dso /usr/lib/libm-2.28.so
unwind: access_mem addr 0x7ffc80811cc0 val 0, offset 32
```

The user stack of the working sample starts with (in hex):

0000000000000000000000009c830441000000000000000000000000000000000000000000000000254a980000000000301d8180fc7f0000c0c10f2b3856000030

The user stack of the broken sample starts with (in hex):

0000000000000000000000002f64fd4000000000000000000000000000000000000000000000000073bb970000000000301d8180fc7f0000c0c10f2b3856000030

This is a 64bit machine, so word width is 8. So both user stacks start with 
`24 * 2 = 48` words of zeros. Hence offset 32 cannot possibly produce a 
meaningful value...

This offset is calculcated from LIBUNWIND__ARCH_REG_SP, cf. unwind-libunwind-
local.c. So is the stack pointer address in the register wrong? If I hackishly 
offset the value for the stack pointer by 40, i.e.:

```
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/
unwind-libunwind-local.c
index 79f521a552cf..a766ddaaa4dd 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -502,6 +502,7 @@ static int access_mem(unw_addr_space_t __maybe_unused as,
 	if (ret)
 		return ret;
 
+	start -= 40;
 	end = start + stack->size;
 
 	/* Check overflow. */
```

Then I can successfully unwind the broken sample:

```
$ perf script -v --no-inline --time 90229.127156,90229.127158 --ns
...
unwind: reg 16, val 7faf7dca2688
unwind: reg 7, val 7ffc80811ca0
unwind: find_proc_info dso /usr/lib/libm-2.28.so
unwind: access_mem addr 0x7ffc80811cc0 val 7faf7dc88af9, offset 72
unwind: find_proc_info dso /usr/lib/libm-2.28.so
unwind: access_mem addr 0x7ffc80811ce0 val 56382b0fc129, offset 104
unwind: find_proc_info dso /home/milian/projects/kdab/rnd/hotspot/build/tests/
test-clients/cpp-inlining/cpp-inlining
unwind: access_mem addr 0x7ffc80811d30 val 7faf7dabf223, offset 184
unwind: find_proc_info dso /usr/lib/libc-2.28.so
unwind: access_mem addr 0x7ffc80811df0 val 56382b0fc1ee, offset 376
unwind: find_proc_info dso /home/milian/projects/kdab/rnd/hotspot/build/tests/
test-clients/cpp-inlining/cpp-inlining
unwind: __hypot_finite:ip = 0x7faf7dca2688 (0x29688)
unwind: hypotf32x:ip = 0x7faf7dc88af8 (0xfaf8)
unwind: main:ip = 0x56382b0fc128 (0x1128)
unwind: __libc_start_main:ip = 0x7faf7dabf222 (0x24222)
unwind: _start:ip = 0x56382b0fc1ed (0x11ed)
cpp-inlining 24617 90229.127157818:     719976 cycles:uppp: 
	    7faf7dca2688 __hypot_finite+0x28 (/usr/lib/libm-2.28.so)
	    7faf7dc88af8 hypotf32x+0x18 (/usr/lib/libm-2.28.so)
	    56382b0fc128 main+0x88 (/home/milian/projects/kdab/rnd/hotspot/
build/tests/test-clients/cpp-inlining/cpp-inlining)
	    7faf7dabf222 __libc_start_main+0xf2 (/usr/lib/libc-2.28.so)
	    56382b0fc1ed _start+0x2d (/home/milian/projects/kdab/rnd/hotspot/
build/tests/test-clients/cpp-inlining/cpp-inlining)

```

So, what now? Here are my open questions:

Is this just working now by chance, or is this the real reason? I.e. is the 
register value for the stack pointer incorrectly recorded?

Can this be fixed somehow during record time?

Can we detect this scenario at analysis time and correct the stack pointer 
address automatically somehow? Should the first frame always try to read from 
offset 72 maybe?

Any help would be greatly appreciated, many thanks

-- 
Milian Wolff
mail@...ianw.de
http://milianw.de
Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ