Message-ID: <alpine.DEB.2.20.1711102228030.2288@nanos>
Date: Fri, 10 Nov 2017 22:29:59 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: Fengguang Wu <fengguang.wu@...el.com>,
Network Development <netdev@...r.kernel.org>,
Linux Wireless List <linux-wireless@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [run_timer_softirq] BUG: unable to handle kernel paging request at 0000000000010007
On Fri, 10 Nov 2017, Linus Torvalds wrote:
> On Wed, Nov 8, 2017 at 9:19 PM, Fengguang Wu <fengguang.wu@...el.com> wrote:
> >
> > Yes it's accessing the list. Here is the faddr2line output.
>
> Ok, so it's a corrupted timer list. Which is not a big surprise.
>
> It's
>
>         next->pprev = pprev;
>
> in __hlist_del(), and the trapping instruction decodes as
>
>         mov    %rdx,0x8(%rax)
>
> with %rax having the value dead000000000200,
>
> Which is just LIST_POISON2.
>
> So we've deleted that entry twice - LIST_POISON2 is what hlist_del()
> sets pprev to after already deleting it once.
>
> Although in this case it might not be hlist_del(), because
> detach_timer() also sets entry->next to LIST_POISON2.
>
> Which is pretty bogus, we are supposed to use LIST_POISON1 for the
> "next" pointer. Oh well. Nobody cares, except for the list entry
> debugging code, which isn't run on the hlist cases.
>
> Adding Thomas Gleixner to the cc. It should not be possible to delete
> the same timer twice.
Right, it shouldn't.
Fengguang, can you please enable:

  CONFIG_DEBUG_OBJECTS
  CONFIG_DEBUG_OBJECTS_TIMERS

and try to reproduce? Debugobjects should hopefully catch that.
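
For reference, the double-delete pattern Linus describes above can be
sketched in user space roughly as follows. This is only an illustration,
not the kernel code: the structures are cut-down copies, the poison values
assume a 64-bit build with a poison delta of 0xdead000000000000, and
detach(), already_detached() and second_delete_store_addr() are
hypothetical helpers made up for the example. The real detach_timer() also
clears pprev in some cases, which the sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down copies of the kernel's hlist types, for illustration only. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Assumed values: 64-bit build, poison delta 0xdead000000000000. */
#define LIST_POISON1 ((struct hlist_node *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct hlist_node *)(uintptr_t)0xdead000000000200ULL)

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

static void __hlist_del(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	*pprev = next;
	if (next)
		next->pprev = pprev;	/* the store that trapped */
}

/*
 * Hypothetical stand-in for the timer detach path: unlink the entry,
 * then poison ->next with LIST_POISON2, as the quoted mail points out.
 */
static void detach(struct hlist_node *n)
{
	__hlist_del(n);
	n->next = LIST_POISON2;
}

/* Hypothetical check: has this entry already been detached once? */
static int already_detached(const struct hlist_node *n)
{
	return n->next == LIST_POISON2;
}

/*
 * Address a second __hlist_del() would write to: next is the poison
 * value, so "next->pprev = pprev" stores at poison + offsetof(pprev).
 * Computed with integer arithmetic so nothing is dereferenced.
 */
static uintptr_t second_delete_store_addr(const struct hlist_node *n)
{
	return (uintptr_t)n->next + offsetof(struct hlist_node, pprev);
}
```

Adding two nodes to a list and calling detach() on one of them leaves its
next pointer at LIST_POISON2. A second __hlist_del() on that node would
then load the poison value as "next" and store through it at offset 8 on a
64-bit build, which is the shape of the "mov %rdx,0x8(%rax)" in the trace.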
Thanks,
tglx