Message-ID: <56CF8124.4080003@hurleysoftware.com>
Date: Thu, 25 Feb 2016 14:33:08 -0800
From: Peter Hurley <peter@...leysoftware.com>
To: Jiri Slaby <jslaby@...e.cz>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Greg KH <gregkh@...uxfoundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
stable <stable@...r.kernel.org>, lwn@....net,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was: Linux 4.4.2]

On 02/25/2016 01:32 PM, Jiri Slaby wrote:
> On 02/25/2016, 09:51 PM, Linus Torvalds wrote:
>> Jiri, can you check your try_to_wake_up() disassembly for some
>> indirect "jmp" instructions?
>
> Nope, there is none.
>
> I will reply to all your questions tomorrow.
>
> Just quickly, as I have to go (and don't want you to duplicate efforts)
> the kernel which was used can be obtained here:
> https://build.opensuse.org/package/binaries/openSUSE:Factory:Staging:I/kernel-default?repository=standard
>
> The issue is very weird, indeed, this is what I noted to our bugzilla:
> The stack trace ends in call of try_to_wake_up. Then, there it has to be
> some of the indirect calls:
>
> callq *0x40(%rax)
> p->sched_class->select_task_rq from select_task_rq
>
> RAX is 0x00000000bb37e180, barely can be read with offset 0x40
>
> callq *0xd85656(%rip) # ffffffff81e2aba0 <smp_ops+0x20>
> smp_ops.smp_send_reschedule from ttwu_queue_remote
>
> Which hardly can be it, given smp_ops is static.
>
> So it has to be some other "call *" from a nested function :(.
>
>
>
>
> Interestingly, RBP contains address inside try_to_wake_up --
> ffffffff810a535a (dunno why) which is:
> ffffffff810a5355: e8 66 a0 ff ff callq ffffffff8109f3c0 <ttwu_stat>
> ffffffff810a535a: e9 9d fe ff ff jmpq ffffffff810a51fc <try_to_wake_up+0x3c>

That would imply that RSP was off by +8 when the ttwu_stat() epilogue
executed, so that RBP was loaded with the return address and RIP with
some local variable from the try_to_wake_up() stack frame.

Looks like R15 in the crash report could be what RBP should have been.

Now to find out why RSP is off by +8.

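To make the off-by-8 reasoning concrete, here is a toy model of a
`pop %rbp; ret` epilogue. It is only a sketch: the stack addresses and the
saved RBP value are made up, the real ttwu_stat() epilogue likely pops more
registers than this, and putting the faulting address in the caller's slot
is an assumption made purely for illustration.

```python
# Toy model of a `pop %rbp; ret` epilogue on a stack of 8-byte slots.
# All stack addresses below are hypothetical.

def run_epilogue(stack, rsp):
    """Return (rbp, rip, rsp) after `pop %rbp; ret` starting at rsp."""
    rbp = stack[rsp]   # pop %rbp
    rsp += 8
    rip = stack[rsp]   # ret pops the return address into RIP
    rsp += 8
    return rbp, rip, rsp

RET_ADDR = 0xffffffff810a535a     # insn after `callq ttwu_stat` (from the disassembly)
SAVED_RBP = 0xffff880000001f00    # hypothetical caller frame pointer
CALLER_SLOT = 0xffff88023fd40000  # assumed: a try_to_wake_up() stack slot

stack = {
    0x1000: SAVED_RBP,    # pushed by the callee's prologue
    0x1008: RET_ADDR,     # pushed by callq
    0x1010: CALLER_SLOT,  # belongs to the caller's frame
}

good = run_epilogue(stack, 0x1000)    # RSP where it should be
skewed = run_epilogue(stack, 0x1008)  # RSP off by +8

print([hex(v) for v in good])
print([hex(v) for v in skewed])
```

With the skewed RSP, RBP ends up holding the return address inside
try_to_wake_up() and RIP is taken from the caller's frame, which matches
the shape of the crash report.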
>
>
> ttwu_stat does in the beginning:
> mov $0x16e80,%r14
>
> which is what we actually still have in r14 when it crashes. The first
> ttwu_stat's "if" has to go through the true branch (otherwise r14 would
> be overwritten).
>
>
>
> Another note: we die when jmp/calling to 0xffff88023fd40000.
> RSI=RDI=0xffff88023fdd6e80. RSI-RIP is 0x96e80, which is R14 + 0x80000.
> Coincidence?
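
For anyone who wants to double-check the arithmetic in the note above, a
quick sketch using only the register values quoted from the crash report
(the variable names are just labels for this example):

```python
# Register values as quoted in the crash report.
RIP = 0xffff88023fd40000  # address the CPU tried to jump/call to
RSI = 0xffff88023fdd6e80  # RSI, equal to RDI in the report
R14 = 0x16e80             # constant loaded at the start of ttwu_stat

delta = RSI - RIP
print(hex(delta))        # 0x96e80
print(hex(delta - R14))  # 0x80000
```

So the numbers do line up: RSI - RIP is exactly R14 + 0x80000. Whether
that is meaningful or a coincidence, the arithmetic alone cannot say.
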
>
> thanks,
>