linux-kernel - Re: BUG: unable to handle kernel paging request from pty


Message-ID: <56D01331.5030401@suse.cz>
Date:	Fri, 26 Feb 2016 09:56:17 +0100
From:	Jiri Slaby <jslaby@...e.cz>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Peter Hurley <peter@...leysoftware.com>,
	Greg KH <gregkh@...uxfoundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	stable <stable@...r.kernel.org>, lwn@....net,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was:
 Linux 4.4.2]

On 02/26/2016, 01:38 AM, Linus Torvalds wrote:
> On Thu, Feb 25, 2016 at 1:32 PM, Jiri Slaby <jslaby@...e.cz> wrote:
>>
>> Interestingly, RBP contains an address inside try_to_wake_up --
>> ffffffff810a535a (dunno why) which is:
>> ffffffff810a5355:       e8 66 a0 ff ff          callq  ffffffff8109f3c0 <ttwu_stat>
>> ffffffff810a535a:       e9 9d fe ff ff          jmpq   ffffffff810a51fc <try_to_wake_up+0x3c>
>>
>> ttwu_stat does in the beginning:
>> mov    $0x16e80,%r14
>>
>> which is what we actually still have in r14 when it crashes. The first
>> ttwu_stat's "if" has to go through the true branch (otherwise r14 would
>> be overwritten).
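
The address arithmetic in the quoted dump can be checked by hand. As a
minimal sketch (the helper name call_target_and_return is made up for
illustration, and it handles only the 5-byte "e8 rel32" near call),
this decodes the call bytes from the dump and shows that the value
seen in RBP is exactly the return address of the call to ttwu_stat:

```python
MASK = 0xFFFFFFFFFFFFFFFF  # keep results within 64 bits


def call_target_and_return(addr, insn):
    """Decode an x86-64 near call (opcode e8 followed by a signed rel32).

    Returns (call target, return address). Hypothetical helper, not
    part of any kernel tooling.
    """
    if len(insn) != 5 or insn[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 rel32 call")
    rel32 = int.from_bytes(insn[1:], "little", signed=True)
    ret = (addr + len(insn)) & MASK   # address of the next instruction
    target = (ret + rel32) & MASK     # displacement is relative to it
    return target, ret


# Address and bytes copied from the dump quoted above.
target, ret = call_target_and_return(0xFFFFFFFF810A5355,
                                     bytes.fromhex("e866a0ffff"))
print(hex(target))  # ttwu_stat
print(hex(ret))     # the value found in RBP
```

So RBP holds what should have been popped by the final ret, which is
consistent with the stack pointer being off by one slot at that point.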
> 
> Hmm. That does sound very much like it might be ttwu_stat() that has
> gotten the stack frame wrong, and when it exits, it does
> 
>         popq    %rbp
>         ret
> 
> but in fact it popped the return address, and then returned to a crazy address.
> 
> Which sounds like a corrupted stack pointer (not a corrupted stack).
> 
> Can you make just the "vmlinux" file available somewhere?

Sure, both vmlinux and vmlinux.debug (its separated debuginfo
sections) are at:
http://labs.suse.cz/jslaby/bug-968218/

There is also core.s which is a result of:
objdump -d vmlinux-4.4.2-3-default | grep -A 10000 '<update_rq_clock>:' >core.s

> In my own private configuration, ttwu_stat() doesn't actually touch
> the stack at all - no stack pointer action anywhere except for the
> 
> ttwu_stat:
> 1:      call    __fentry__
>         pushq   %rbp
>    ..
>         movq    %rsp, %rbp      #,
> 
>  .....
> 
>         popq    %rbp
>         ret
> 
> but yeah, as Peter says, maybe an exception screwed up %rsp somehow..

Lucky you. My ttwu_stat does a bit more stack save-restoring. But the
pushes and pops all seem to be paired:

ffffffff8109f3c0 <ttwu_stat>:
ffffffff8109f3c0:       e8 fb ca 60 00          callq  ffffffff816abec0 <__fentry__>
ffffffff8109f3c5:       55                      push   %rbp
ffffffff8109f3c6:       48 89 e5                mov    %rsp,%rbp
ffffffff8109f3c9:       41 57                   push   %r15
ffffffff8109f3cb:       41 56                   push   %r14
ffffffff8109f3cd:       41 55                   push   %r13
ffffffff8109f3cf:       41 54                   push   %r12
ffffffff8109f3d1:       49 c7 c6 80 6e 01 00    mov    $0x16e80,%r14
ffffffff8109f3d8:       53                      push   %rbx
...
ffffffff8109f48c:       5b                      pop    %rbx
ffffffff8109f48d:       41 5c                   pop    %r12
ffffffff8109f48f:       41 5d                   pop    %r13
ffffffff8109f491:       41 5e                   pop    %r14
ffffffff8109f493:       41 5f                   pop    %r15
ffffffff8109f495:       5d                      pop    %rbp
ffffffff8109f496:       c3                      retq
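
The "all paired" claim can also be checked mechanically. The following
is only a rough sketch: the function name unbalanced_stack_ops is
invented here, the listing is the prologue and epilogue from the dump
above with addresses and opcode bytes stripped, and the elided middle
of the function (the "..." above) and any other exit paths are not
covered.

```python
import re

# Prologue and epilogue of ttwu_stat as shown in the dump above
# (mnemonics and operands only; the elided body is NOT included).
LISTING = """
push   %rbp
mov    %rsp,%rbp
push   %r15
push   %r14
push   %r13
push   %r12
mov    $0x16e80,%r14
push   %rbx
pop    %rbx
pop    %r12
pop    %r13
pop    %r14
pop    %r15
pop    %rbp
retq
"""


def unbalanced_stack_ops(listing):
    """Return a list of problems found in a straight-line listing.

    An empty list means every push is undone by a pop of the same
    register in LIFO order before each ret. Hypothetical helper.
    """
    stack, problems = [], []
    for line in listing.splitlines():
        m = re.match(r"\s*(push|pop)q?\s+(%\w+)", line)
        if m:
            op, reg = m.groups()
            if op == "push":
                stack.append(reg)
            elif not stack:
                problems.append("pop %s with nothing pushed" % reg)
            else:
                top = stack.pop()
                if top != reg:
                    problems.append("pop %s but %s was on top" % (reg, top))
        elif re.match(r"\s*retq?\b", line) and stack:
            problems.append("ret with %s still pushed" % ", ".join(stack))
    return problems


print(unbalanced_stack_ops(LISTING))
```

An empty result only says that this one path is balanced; it says
nothing about %rsp being changed by something outside the function.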


> I really don't see how it would happen here - that code doesn't look
> particularly odd.
> 
> And the fentry code used by the function tracer can certainly screw
> things up, but even that would be hard-pressed to screw up %rbp, since
> the saving of rbp comes *after* fentry. Old pre-__fentry__ gcc
> versions had a much higher likelihood (the whole mcount thing is a
> disaster, but I'm assuming you have a compiler that does __fentry__
> and have CC_USING_FENTRY set?)

Yep, -mfentry is in use, as is obvious from the dump above. It is
compiled by gcc 5.3.1 rev231346.

thanks,
-- 
js
suse labs