linux-kernel - [PATCH v2] scripts: add script for translating stack dump function offsets


Message-ID: <20160916212656.bnmxrl3cglsbjmpm@treble>
Date:   Fri, 16 Sep 2016 16:26:56 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Stephane Eranian <eranian@...gle.com>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        LKML <linux-kernel@...r.kernel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Ingo Molnar <mingo@...nel.org>,
        Kees Cook <keescook@...omium.org>
Subject: [PATCH v2] scripts: add script for translating stack dump function
 offsets

On Fri, Sep 16, 2016 at 12:26:31PM -0700, Linus Torvalds wrote:
> On Fri, Sep 16, 2016 at 12:17 PM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> >
> > I think that issue is solved by addr2line's '--inline' option, which the
> > script uses:
> 
> Oh, well, even better. I clearly don't know addr2line well enough, and
> having a script that does this correctly automatically is clearly what
> *I* need too.
> 
> >>                 So both the function offset
> >> filtering and the type filtering could definitely make a difference.
> >
> > Yeah, good ideas.  That would help reduce some of the false duplicates,
> > though they are quite rare.  I'll see what I can do.
> 
> Yeah, in practice the false duplicates almost never happen. We do have
> duplicate function names, but they tend to be for simple things.
> 
> And the call trace often makes it obvious which particular function it
> is for the human that is reading the output, but since it should be
> easy to cut down on the potential duplicates, I think it's a good
> thing to do.

Ok, how about this.  If this looks ok, would you be willing to apply it?

---

From: Josh Poimboeuf <jpoimboe@...hat.com>
Subject: [PATCH v2] scripts: add script for translating stack dump function
 offsets

addr2line doesn't work with KASLR addresses.  Add a basic addr2line
wrapper script which takes the 'func+offset/size' format as input.

Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
---
v2:
- add size and function type checking
- use readelf for more deterministic output
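
For illustration only, here is a minimal sketch of how the 'func+offset/size'
input can be split with bash parameter expansion. The parse_func_addr helper
is hypothetical and not part of the patch; it only mirrors the expansions the
script uses, and it assumes the function name itself contains no '+' or '/'.

```shell
#!/bin/bash
# Sketch only: parse_func_addr is a hypothetical helper, not part of the patch.
parse_func_addr() {
	local func_addr=$1
	local func=${func_addr%+*}      # text before the last '+'
	local offset=${func_addr#*+}    # text after the first '+'
	offset=${offset%/*}             # drop the optional '/size' suffix
	local size=
	[[ $func_addr == */* ]] && size=${func_addr#*/}
	echo "func=$func offset=$offset size=$size"
}

parse_func_addr meminfo_proc_show+0x5/0x568
parse_func_addr raw_ioctl+0x5
```

When the '/size' part is omitted, size stays empty, which is how the size
check can be skipped for that input.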

 scripts/faddr2line | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)
 create mode 100755 scripts/faddr2line

diff --git a/scripts/faddr2line b/scripts/faddr2line
new file mode 100755
index 0000000..a837815
--- /dev/null
+++ b/scripts/faddr2line
@@ -0,0 +1,96 @@
+#!/bin/bash
+#
+# Translate stack dump function offsets.
+#
+# addr2line doesn't work with KASLR addresses.  This works similarly to
+# addr2line, but instead takes the 'func+0x123' format as input:
+#
+#   $ ./scripts/faddr2line vmlinux meminfo_proc_show+0x5/0x568
+#   fs/proc/meminfo.c:27
+#
+# If the address is part of an inlined function, the full inline call chain is
+# printed:
+#
+#   $ ./scripts/faddr2line vmlinux native_write_msr+0x6/0x27
+#   arch/x86/include/asm/msr.h:121
+#   include/linux/jump_label.h:125
+#   arch/x86/include/asm/msr.h:125
+#
+# The function size after the '/' in the input is optional, but recommended.
+# It's used to help disambiguate any duplicate symbol names, which can occur
+# rarely.  If the size is omitted for a duplicate symbol then it's possible for
+# multiple code sites to be printed:
+#
+#   $ ./scripts/faddr2line vmlinux raw_ioctl+0x5
+#   drivers/char/raw.c:122
+#   net/ipv4/raw.c:876
+
+set -o errexit
+set -o nounset
+
+usage() {
+	echo "usage: faddr2line <object file> <func+offset>" >&2
+	exit 1
+}
+
+die() {
+	echo "ERROR: $1" >&2
+	exit 1
+}
+
+command -v awk >/dev/null 2>&1 || die "awk isn't installed"
+command -v readelf >/dev/null 2>&1 || die "readelf isn't installed"
+command -v addr2line >/dev/null 2>&1 || die "addr2line isn't installed"
+
+[[ $# != 2 ]] && usage
+
+objfile=$1
+[[ ! -f $objfile ]] && die "can't find objfile $objfile"
+
+func_addr=$2
+func=${func_addr%+*}
+offset=${func_addr#*+}
+offset=${offset%/*}
+size=
+[[ $func_addr =~ "/" ]] && size=${func_addr#*/}
+
+if [[ -z $func ]] || [[ -z $offset ]] || [[ $func = $func_addr ]]; then
+	die "bad func+offset $func_addr"
+fi
+
+# Go through each of the object's symbols which match the func name.
+# In rare cases there might be duplicates.
+while read symbol; do
+	fields=($symbol)
+	sym_base=0x${fields[1]}
+	sym_size=${fields[2]}
+	sym_type=${fields[3]}
+
+	# calculate the address
+	addr=$(($sym_base + $offset))
+	if [[ -z $addr ]] || [[ $addr = 0 ]]; then
+		die "bad address: $sym_base + $offset"
+	fi
+	hexaddr=0x$(printf %x $addr)
+
+	# weed out non-function symbols
+	if [[ $sym_type != "FUNC" ]]; then
+		echo "skipping $func address at $hexaddr due to non-function symbol"
+		continue
+	fi
+
+	# if the user provided a size, make sure it matches the symbol's size
+	if [[ -n $size ]] && [[ $size -ne $sym_size ]]; then
+		echo "skipping $func address at $hexaddr due to size mismatch ($size != $sym_size)"
+		continue
+	fi
+
+	# make sure the provided offset is within the symbol's range
+	if [[ $offset -gt $sym_size ]]; then
+		echo "skipping $func address at $hexaddr due to out-of-range offset ($offset > $sym_size)"
+		continue
+	fi
+
+	addr2line -ie "$objfile" $hexaddr
+
+done < <(readelf -s "$objfile" | awk -v f="$func" '$8 == f {print}')
-- 
2.7.4
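
As a rough sketch of the per-symbol checks, the snippet below runs one line of
'readelf -s' style output through the same steps: take the Value, Size and Type
columns, filter on type and size, then add the offset. The resolve_symbol
helper is hypothetical, and the symbol line (index, address, size) is made up
for illustration rather than taken from a real vmlinux.

```shell
#!/bin/bash
# Sketch only: resolve_symbol is a hypothetical helper, and the readelf line
# fed to it below is invented for illustration.
resolve_symbol() {
	local line=$1 offset=$2 size=$3
	local fields=($line)
	local sym_base=0x${fields[1]}   # Value column (hex, no 0x prefix)
	local sym_size=${fields[2]}     # Size column
	local sym_type=${fields[3]}     # Type column

	# weed out non-function symbols
	[[ $sym_type == FUNC ]] || { echo "skip: not a function"; return 0; }

	# if a size was given, it has to match the symbol's size
	if [[ -n $size ]] && (( size != sym_size )); then
		echo "skip: size mismatch"; return 0
	fi

	# the offset has to fall within the symbol
	(( offset > sym_size )) && { echo "skip: offset out of range"; return 0; }

	printf '0x%x\n' $(( sym_base + offset ))
}

line='  1234: ffffffff81234560  1384 FUNC    GLOBAL DEFAULT    1 meminfo_proc_show'
resolve_symbol "$line" 0x5 0x568
```

The printed address is what would then be handed to 'addr2line -ie'. Passing
the size from the stack dump is what lets a duplicate symbol name with a
different size be skipped.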