Message-ID: <20220118175239.lqxi2ycgeusk5pxl@treble>
Date: Tue, 18 Jan 2022 09:52:39 -0800
From: Josh Poimboeuf <jpoimboe@...hat.com>
To: Kaiwan N Billimoria <kaiwan.billimoria@...il.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Chi-Thanh Hoang <chithanh.hoang@...il.com>
Subject: Re: Issue using faddr2line on kernel modules

On Tue, Jan 18, 2022 at 08:10:28AM +0530, Kaiwan N Billimoria wrote:
> Hi Josh,
>
> Actually your first patch - the one you mentioned had other issues -
> worked perfectly when applied:
>
> scripts/faddr2line ./oops_tryv2.ko do_the_work+0x16f/0x194
> do_the_work+0x16f/0x0000000000000194:
> do_the_work at <...>/oops_tryv2/oops_tryv2.c:62
>
> The second one still failed in the same manner:
>
> scripts/faddr2line ./oops_tryv2.ko do_the_work+0x16f/0x194
> bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
>
> So, is it possible to fixup issues with the first version?
> What are these issues?

The first patch basically reverts the fix in commit efdb4167e676
("scripts/faddr2line: Fix "size mismatch" error"). That would be nice
as it's simpler and more robust, but unfortunately it would cause a lot
of "size mismatch" errors with vmlinux symbols.

Can you give the output of 'nm -n ./oops_tryv2.ko'? There must be some
text symbol immediately after the do_the_work() symbol which is either
out of order, or part of another section.

Is do_the_work() in the .text section?
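
To illustrate the failure modes discussed here, the following is a minimal
sketch, not the script itself. It runs the pre-patch and patched awk pairing
logic over made-up 'nm -n' output. The symbols demo_data and demo_init, and the
0x194 file-end value, are hypothetical and chosen only for the demonstration;
the real module's symbol layout is unknown until the nm output is posted.

```shell
#!/bin/sh
# Hypothetical 'nm -n' output: a data symbol sorts right after the function.
nm_data_after() {
    printf '%s\n' \
        '0000000000000000 t do_the_work' \
        '0000000000000000 d demo_data'
}

# Hypothetical 'nm -n' output: a text symbol from another section (for
# example an init function) also sits at section-relative address 0.
nm_text_after() {
    printf '%s\n' \
        '0000000000000000 t do_the_work' \
        '0000000000000000 t demo_init' \
        '0000000000000000 d demo_data'
}

# Pairing used before the patch: the end address is taken from the next
# symbol of any type.
end_any() {
    awk -v fn="$1" -v end="$2" '
        $3 == fn   { found = 1; line = $0; next }
        found == 1 { found = 0; print line, "0x"$1 }
        END        { if (found == 1) print line, end }'
}

# Pairing from the patch: non-text symbols are skipped first.
end_text() {
    awk -v fn="$1" -v end="$2" '
        $2 !~ /[Tt]/ { next }
        $3 == fn     { found = 1; line = $0; next }
        found == 1   { found = 0; print line, "0x"$1 }
        END          { if (found == 1) print line, end }'
}

nm_data_after | end_any  do_the_work 0x194   # end is 0x0: size would be 0
nm_data_after | end_text do_the_work 0x194   # end falls back to 0x194
nm_text_after | end_text do_the_work 0x194   # end is 0x0 again
```

In the last case the patched filter still pairs do_the_work with a symbol at
address 0, which would produce the same "bad symbol size" error. Whether that
is what happens with oops_tryv2.ko is an assumption until the real nm output is
available; 'objdump -t ./oops_tryv2.ko' lists the section each symbol belongs
to and could confirm it.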
> On Tue, Jan 18, 2022 at 1:57 AM Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> >
> > On Mon, Jan 17, 2022 at 11:48:39AM -0800, Josh Poimboeuf wrote:
> > > On Mon, Jan 17, 2022 at 10:27:14AM +0530, Kaiwan N Billimoria wrote:
> > > > Hi there,
> > > >
> > > > I am researching the cool faddr2line script to help debug Oopses
> > > > from kernel modules.
> > > > I find it works just fine when used against the unstripped vmlinux
> > > > with debug symbols.
> > > >
> > > > My use case is for a kernel module which Oopses, though. Here's my scenario:
> > > > I built a module on a custom debug kernel (5.10.60) with most debug
> > > > options enabled...
> > > > KASLR is enabled by default as well.
> > > >
> > > > A test kernel module Oopses on my x86_64 guest running this kernel with:
> > > > RIP: 0010:do_the_work+0x15b/0x174 [oops_tryv2]
> > > >
> > > > So, I try this:
> > > >
> > > > $ <...>/linux-5.10.60/scripts/faddr2line ./oops_tryv2.ko do_the_work+0x15b/0x174
> > > > bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
> > > > $
> > > >
> > > > (It works fine with addr2line though!).
> > > > Now I think I've traced the faddr2line script's failure to locate
> > > > anything down to this:
> > > > ...
> > > > done < <(${NM} -n $objfile | awk -v fn=$func -v end=$file_end '$3 ==
> > > > fn { found=1; line =$0; start=$1; next } found == 1 { found=0;
> > > > print line, "0x"$1 } END {if (found == 1) print line, end; }')
> > > >
> > > > The nm output is:
> > > > $ nm -n ./oops_tryv2.ko |grep -i do_the_work
> > > > 0000000000000000 t do_the_work
> > > > $
> > > >
> > > > nm shows the text addr as 0x0; this is obviously incorrect (same 0x0
> > > > with objdump -d on the module).
> > > > Am I missing something? Any suggestions as to what I can try, to get
> > > > faddr2line working?
> > >
> > > Hi Kaiwan,
> > >
> > > Thanks for reporting this issue. The module text address of 0x0 is not
> > > necessarily incorrect, as the address is relative to the module, where
> > > all text usually starts at zero.
> > >
> > > I was able to recreate this problem using a module which only has a
> > > single function in .text. Does this fix it?
> >
> > Actually, that patch has other problems. Try this one?
> >
> > ----
> >
> > From: Josh Poimboeuf <jpoimboe@...hat.com>
> > Subject: [PATCH] scripts/faddr2line: Only look for text symbols when
> > calculating function size
> >
> > With the following commit:
> >
> > efdb4167e676 ("scripts/faddr2line: Fix "size mismatch" error")
> >
> > ... it was discovered that faddr2line can't just read a function's ELF
> > size, because that wouldn't match the kallsyms function size which is
> > printed in the stack trace. The kallsyms size includes any padding
> > after the function, whereas the ELF size does not.
> >
> > So faddr2line has to manually calculate the size of a function similar
> > to how kallsyms does. It does so by starting with a sorted list of
> > symbols and subtracting the function address from the subsequent
> > symbol's address.
> >
> > That calculation is broken in the case where the function is the last
> > (or only) symbol in the .text section. The next symbol in the sorted
> > list might actually be a data symbol, which can break the function size
> > detection:
> >
> > $ scripts/faddr2line sound/soundcore.ko sound_devnode+0x5/0x35
> > bad symbol size: base: 0x0000000000000000 end: 0x0000000000000000
> >
> > Similar breakage can occur when reading from a .o file.
> >
> > Fix it by only looking for text symbols.
> >
> > Fixes: efdb4167e676 ("scripts/faddr2line: Fix "size mismatch" error")
> > Reported-by: Kaiwan N Billimoria <kaiwan.billimoria@...il.com>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
> > ---
> > scripts/faddr2line | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/scripts/faddr2line b/scripts/faddr2line
> > index 6c6439f69a72..2a130134f1e6 100755
> > --- a/scripts/faddr2line
> > +++ b/scripts/faddr2line
> > @@ -189,7 +189,7 @@ __faddr2line() {
> >
> > DONE=1
> >
> > - done < <(${NM} -n $objfile | awk -v fn=$func -v end=$file_end '$3 == fn { found=1; line=$0; start=$1; next } found == 1 { found=0; print line, "0x"$1 } END {if (found == 1) print line, end; }')
> > + done < <(${NM} -n $objfile | awk -v fn=$func -v end=$file_end '$2 !~ /[Tt]/ {next} $3 == fn { found=1; line=$0; start=$1; next } found == 1 { found=0; print line, "0x"$1 } END {if (found == 1) print line, end; }')
> > }
> >
> > [[ $# -lt 2 ]] && usage
> > --
> > 2.31.1
> >
>
--
Josh