linux-kernel - Re: BUG: unable to handle kernel paging request from pty

Message-ID: <56CF8124.4080003@hurleysoftware.com>
Date:	Thu, 25 Feb 2016 14:33:08 -0800
From:	Peter Hurley <peter@...leysoftware.com>
To:	Jiri Slaby <jslaby@...e.cz>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Greg KH <gregkh@...uxfoundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	stable <stable@...r.kernel.org>, lwn@....net,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was:
 Linux 4.4.2]

On 02/25/2016 01:32 PM, Jiri Slaby wrote:
> On 02/25/2016, 09:51 PM, Linus Torvalds wrote:
>> Jiri, can you check your try_to_wake_up() disassembly for some
>> indirect "jmp" instructions?
> 
> Nope, there is none.
> 
> I will reply to all your questions tomorrow.
> 
> Just quickly, as I have to go (and don't want you to duplicate efforts)
> the kernel which was used can be obtained here:
> https://build.opensuse.org/package/binaries/openSUSE:Factory:Staging:I/kernel-default?repository=standard
> 
> The issue is very weird indeed. This is what I noted in our bugzilla:
> The stack trace ends in try_to_wake_up, so the faulting call has to be
> one of its indirect calls:
> 
> callq  *0x40(%rax)
>   p->sched_class->select_task_rq from select_task_rq
> 
> RAX is 0x00000000bb37e180, which could hardly be read at offset 0x40
> 
> callq  *0xd85656(%rip) # ffffffff81e2aba0 <smp_ops+0x20>
>   smp_ops.smp_send_reschedule from ttwu_queue_remote
> 
> That can hardly be it, given that smp_ops is static.
> 
> So it has to be some other "call *" from a nested function :(.
>
> Interestingly, RBP contains address inside try_to_wake_up --
> ffffffff810a535a (dunno why) which is:
> ffffffff810a5355:       e8 66 a0 ff ff          callq  ffffffff8109f3c0 <ttwu_stat>
> ffffffff810a535a:       e9 9d fe ff ff          jmpq   ffffffff810a51fc <try_to_wake_up+0x3c>
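
As a quick cross-check of the two disassembly lines quoted above, the branch
targets can be recomputed from the raw bytes. This is only a minimal sketch in
Python: the helper name rel32_target is made up for illustration, and it
assumes the usual x86-64 encoding in which e8 (call) and e9 (jmp) carry a
signed 32-bit displacement relative to the address of the next instruction.

```python
import struct

def rel32_target(addr, insn_bytes):
    """Return the target of a 5-byte call/jmp rel32 located at addr."""
    # Only the e8 (call rel32) and e9 (jmp rel32) forms are handled here.
    assert insn_bytes[0] in (0xE8, 0xE9) and len(insn_bytes) == 5
    # The displacement is a little-endian signed 32-bit value.
    (disp,) = struct.unpack("<i", insn_bytes[1:5])
    # It is applied to the address of the following instruction.
    return (addr + len(insn_bytes) + disp) & 0xFFFFFFFFFFFFFFFF

call_target = rel32_target(0xFFFFFFFF810A5355, bytes.fromhex("e866a0ffff"))
jmp_target = rel32_target(0xFFFFFFFF810A535A, bytes.fromhex("e99dfeffff"))

print(hex(call_target))  # address listed for <ttwu_stat>
print(hex(jmp_target))   # address listed for <try_to_wake_up+0x3c>
```

Both results agree with the symbolized targets in the listing, so
ffffffff810a535a really is the return address of the ttwu_stat call.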

That would imply that RSP was off by +8 when the ttwu_stat() epilog was
executed, so that RBP was loaded with the return address and RIP with some
local variable from the try_to_wake_up() stack frame.
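
To make the off-by-8 reasoning concrete, here is a toy model in Python. It is
a sketch under stated assumptions, not the real code path: the epilog is
assumed to end in "pop %r15; pop %rbp; ret", the stack addresses are arbitrary,
and the saved values other than the return address are placeholders.

```python
RET_ADDR = 0xFFFFFFFF810A535A  # return address into try_to_wake_up (from the report)
SAVED_R15 = 0x1111             # placeholder: caller's r15
SAVED_RBP = 0x2222             # placeholder: caller's frame pointer
CALLER_LOCAL = 0x3333          # placeholder: a slot in try_to_wake_up's frame

# Toy stack, lowest address first; addresses are arbitrary.
stack = {
    0x1000: SAVED_R15,
    0x1008: SAVED_RBP,
    0x1010: RET_ADDR,
    0x1018: CALLER_LOCAL,
}

def run_epilog(stack, rsp):
    """Model the assumed epilog: pop %r15; pop %rbp; ret."""
    regs = {}
    for reg in ("r15", "rbp", "rip"):  # ret behaves like a pop into rip
        regs[reg] = stack[rsp]
        rsp += 8
    return regs

good = run_epilog(stack, 0x1000)    # RSP where it should be
skewed = run_epilog(stack, 0x1008)  # RSP off by +8

print({k: hex(v) for k, v in good.items()})
print({k: hex(v) for k, v in skewed.items()})
```

In the skewed run every pop is shifted by one slot: r15 receives the saved
frame pointer, rbp receives the return address ffffffff810a535a, and rip is
taken from the caller's frame.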

Looks like R15 in the crash report could be what RBP should have been.

Now to find out why RSP is off by +8.


> 
> 
> ttwu_stat does in the beginning:
> mov    $0x16e80,%r14
> 
> which is what we actually still have in r14 when it crashes. The first
> "if" in ttwu_stat has to take the true branch (otherwise r14 would
> be overwritten).
> 
> 
> 
> Another note: we die when jmp/calling to 0xffff88023fd40000.
> RSI=RDI=0xffff88023fdd6e80. RSI-RIP is 0x96e80, which is R14 + 0x80000.
> Coincidence?
> 
> thanks,
>
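
The arithmetic in the last note can be checked directly. A small Python
sketch using only the register values quoted in the report (the variable
names are just labels for this illustration):

```python
RIP = 0xFFFF88023FD40000  # address we die jumping/calling to
RSI = 0xFFFF88023FDD6E80  # RSI == RDI in the crash report
R14 = 0x16E80             # constant loaded at the start of ttwu_stat

delta = RSI - RIP
print(hex(delta))        # 0x96e80
print(hex(delta - R14))  # 0x80000
```

So RSI - RIP is indeed R14 + 0x80000. Whether that is meaningful or a
coincidence is not settled by the numbers alone.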