linux-kernel - Re: Suggestions on how to debug kernel crashes where printk and gdb both does not work


Message-ID: <20210614172512.799db10d@gmail.com>
Date:   Mon, 14 Jun 2021 17:25:12 +0300
From:   Pavel Skripkin <paskripkin@...il.com>
To:     Dongliang Mu <mudongliangabcd@...il.com>
Cc:     alex.aring@...il.com, "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-wpan@...r.kernel.org, netdev@...r.kernel.org,
        stefan@...enfreihafen.org,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        syzbot+b80c9959009a9325cdff@...kaller.appspotmail.com,
        Dan Carpenter <dan.carpenter@...cle.com>,
        Greg KH <gregkh@...uxfoundation.org>
Subject: Re: Suggestions on how to debug kernel crashes where printk and gdb
 both does not work

On Mon, 14 Jun 2021 22:19:10 +0800
Dongliang Mu <mudongliangabcd@...il.com> wrote:

> On Mon, Jun 14, 2021 at 9:34 PM Pavel Skripkin <paskripkin@...il.com>
> wrote:
> >
> > On Mon, 14 Jun 2021 21:22:43 +0800
> > Dongliang Mu <mudongliangabcd@...il.com> wrote:
> >
> > > Dear kernel developers,
> > >
> > > I was recently trying to debug the crash "memory leak in
> > > hwsim_add_one" [1]. However, I ran into a frustrating issue: my
> > > breakpoints and printk/pr_alert calls in functions that are
> > > certain to be executed do not work. The stack trace is as follows.
> > > I am writing this email to ask for suggestions on how to debug
> > > such cases.
> > >
> > > Thanks very much. Looking forward to your reply.
> > >
> >
> > Hi, Dongliang!
> >
> > This bug is not similar to the others on the dashboard. I spent some
> > time debugging it a week ago. The main problem here is that the
> > memory allocation happens at boot time:
> >
> > > [<ffffffff84359255>] kernel_init+0xc/0x1a7 init/main.c:1447
> >
> 
> Oh, nice catch. No wonder my debugging did not work. :(
> 
> > and the reproducer simply tries to
> > free this data. You can use ftrace to look at it. Something like this:
> >
> > $ echo 'hwsim_*' > $TRACE_DIR/set_ftrace_filter
> 
> Thanks for your suggestion.
> 
> Do you have any conclusions about this case? If you have found the
> root cause and started writing patches, I will turn my focus to other
> cases.

No, I had some busy days and I have nothing new on this bug for now.
I've only traced the reproducer's execution, and that's all :)
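
For reference, the tracing mentioned above can be sketched roughly as
follows. This is only a minimal sketch under assumptions: the default
tracefs path (/sys/kernel/tracing), the helper name trace_hwsim and the
reproducer path ./repro are illustrative and do not come from this
thread. Only the 'hwsim_*' filter is taken from the command quoted
above; the other files are the standard ftrace control files.

```shell
#!/bin/sh
# Sketch: run a command with the function tracer limited to hwsim_*.
# TRACE_DIR defaults to /sys/kernel/tracing (an assumption); on some
# systems tracefs is mounted at /sys/kernel/debug/tracing instead.
# Writing to these files on a real system requires root.

trace_hwsim() {
    trace_dir="${TRACE_DIR:-/sys/kernel/tracing}"

    # Restrict the function tracer to functions matching hwsim_*.
    echo 'hwsim_*' > "$trace_dir/set_ftrace_filter" || return 1
    echo function > "$trace_dir/current_tracer" || return 1

    # Record only while the given command (the reproducer) runs.
    echo 1 > "$trace_dir/tracing_on" || return 1
    "$@"
    echo 0 > "$trace_dir/tracing_on" || return 1

    # Print whatever the tracer collected.
    cat "$trace_dir/trace"
}
```

Usage would be something like "trace_hwsim ./repro". Note that this only
shows what the reproducer does at run time. Since the allocation itself
happens at boot, that side would need tracing from the kernel command
line (the ftrace= and ftrace_filter= parameters), which has not been
tried here.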

I guess some error handling paths are broken, but I'm not sure.
 

> 
> BTW, I only found another possible memory leak after some manual code
> review [1]. However, it is not the root cause for this crash.
> 
> [1] https://lkml.org/lkml/2021/6/10/1297
> 
> >
> > would work.
> >
> >
> > With regards,
> > Pavel Skripkin




With regards,
Pavel Skripkin