linux-kernel - Re: tty crash in tty_ldisc_receive_buf()

Message-ID: <CAL_JsqKe7SdO+Bg219v02Jve86zGs=f-2R0AHnhCtOKjb_bB3A@mail.gmail.com>
Date:   Thu, 6 Apr 2017 08:28:17 -0500
From:   Rob Herring <robh@...nel.org>
To:     Michael Neuling <mikey@...ling.org>
Cc:     Al Viro <viro@...iv.linux.org.uk>, Johan Hovold <johan@...nel.org>,
        Peter Hurley <peter@...leysoftware.com>,
        Wang YanQing <udknight@...il.com>,
        Alexander Popov <alex.popov@...ux.com>,
        Mikulas Patocka <mpatocka@...hat.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        benh <benh@...nel.crashing.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: tty crash in tty_ldisc_receive_buf()

On Thu, Apr 6, 2017 at 2:04 AM, Michael Neuling <mikey@...ling.org> wrote:
> Hi all,
>
> We are seeing the following crash (in linux-next but has been around since at
> least v4.10).
>
> [  417.514499] Unable to handle kernel paging request for data at address 0x00002260
> [  417.515361] Faulting instruction address: 0xc0000000006fad80
> cpu 0x15: Vector: 300 (Data Access) at [c00000799411f890]
>     pc: c0000000006fad80: n_tty_receive_buf_common+0xc0/0xbd0
>     lr: c0000000006fad5c: n_tty_receive_buf_common+0x9c/0xbd0
>     sp: c00000799411fb10
>    msr: 900000000280b033
>    dar: 2260
>  dsisr: 40000000
>   current = 0xc0000079675d1e00
>   paca    = 0xc00000000fb0d200   softe: 0        irq_happened: 0x01
>     pid   = 5, comm = kworker/u56:0
> Linux version 4.11.0-rc5-next-20170405 (mikey@...86) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #2 SMP Thu Apr 6 00:36:46 CDT 2017
> enter ? for help
> [c00000799411fbe0] c0000000006ff968 tty_ldisc_receive_buf+0x48/0xe0
> [c00000799411fc10] c0000000007009d8 tty_port_default_receive_buf+0x68/0xe0
> [c00000799411fc50] c0000000006ffce4 flush_to_ldisc+0x114/0x130
> [c00000799411fca0] c00000000010a0fc process_one_work+0x1ec/0x580
> [c00000799411fd30] c00000000010a528 worker_thread+0x98/0x5d0
> [c00000799411fdc0] c00000000011343c kthread+0x16c/0x1b0
> [c00000799411fe30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74
>
> It seems the null ptr deref is in n_tty_receive_buf_common() where we do:
>
>                 size_t tail = smp_load_acquire(&ldata->read_tail);
>
> ldata is NULL.
>
> We see this usually on boot but can also see it if we kill a getty attached to
> tty (which is then respawned by systemd).  It seems like we are flushing data to
> a tty at the same time as it's being torn down and restarted.
>
> I did try the below patch which avoids the crash but locks up one of the CPUs. I
> guess the data never gets flushed if we say nothing is processed.
>
> This is on powerpc but has also been reported by parisc.
>
> I'm not at all familiar with the tty layer and looking at the locks, mutexes,
> semaphores and reference counting in there scares the hell out of me.
>
> If anyone has an idea, I'm happy to try a patch.

Can you try this one [1]?

Rob

[1] https://lkml.org/lkml/2017/3/23/569
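
A note on the quoted oops: the faulting data address (dar: 2260) is small but
not zero, which fits the report that ldata is NULL. Reading a field through a
NULL struct pointer faults at the field's offset within the struct, not at
address 0. The following is only a minimal user-space sketch of that
arithmetic. The struct demo_ldata, its layout, DEMO_BUF_SIZE and both helper
functions are hypothetical and are not the kernel's real definitions, so the
offset it yields will not match 0x2260.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUF_SIZE 4096 /* illustrative size, not taken from the kernel */

/* Hypothetical stand-in for a line discipline's private data.
 * The real kernel structure has a different layout. */
struct demo_ldata {
	size_t read_head;
	unsigned char read_buf[DEMO_BUF_SIZE];
	unsigned char echo_buf[DEMO_BUF_SIZE];
	size_t read_tail; /* sits after two large buffers */
};

/* Address the CPU would access for &ldata->read_tail, given the numeric
 * value of the ldata pointer. Nothing is dereferenced here. */
static uintptr_t field_address(uintptr_t ldata_base)
{
	return ldata_base + offsetof(struct demo_ldata, read_tail);
}

/* With ldata == NULL (address 0 on the usual platforms), the faulting
 * address is just the field offset. */
static uintptr_t null_deref_fault_address(void)
{
	return field_address(0);
}

/* Compile-time check: the field lies past both buffers, so the fault
 * address is well above zero even though the pointer itself was NULL. */
_Static_assert(offsetof(struct demo_ldata, read_tail) >= 2 * DEMO_BUF_SIZE,
	       "read_tail should follow both buffers");
```

Printing null_deref_fault_address() with "%#lx" from a small main() shows the
kind of value that would appear in a fault report for this made-up layout.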