linux-kernel - Re: tty: panic in tty_ldisc

Message-ID: <20170302192731.GA19622@kroah.com>
Date:   Thu, 2 Mar 2017 20:27:31 +0100
From:   Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Jiri Slaby <jslaby@...e.com>, LKML <linux-kernel@...r.kernel.org>,
        Peter Hurley <peter@...leysoftware.com>,
        One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
        syzkaller <syzkaller@...glegroups.com>
Subject: Re: tty: panic in tty_ldisc_restore

On Thu, Mar 02, 2017 at 07:27:35PM +0100, Dmitry Vyukov wrote:
> On Tue, Feb 28, 2017 at 7:11 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> > On Fri, Feb 17, 2017 at 10:51 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> >>>>> >> >> Hello,
> >>>>> >> >>
> >>>>> >> >> Syzkaller fuzzer started crashing kernel with the following panics:
> >>>>> >> >>
> >>>>> >> >> Kernel panic - not syncing: Couldn't open N_TTY ldisc for ircomm0 --- error -12.
> >>>>> >> >> CPU: 0 PID: 5637 Comm: syz-executor3 Not tainted 4.9.0 #6
> >>>>> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >>>>> >> >>  ffff8801d4ba7a18 ffffffff8234d0df ffffffff00000000 1ffff1003a974ed6
> >>>>> >> >>  ffffed003a974ece 0000000041b58ab3 ffffffff84b38180 ffffffff8234cdf1
> >>>>> >> >>  0000000000000000 0000000000000000 ffff8801d4ba76a8 00000000dabb4fad
> >>>>> >> >> Call Trace:
> >>>>> >> >>  [<ffffffff8234d0df>] __dump_stack lib/dump_stack.c:15 [inline]
> >>>>> >> >>  [<ffffffff8234d0df>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
> >>>>> >> >>  [<ffffffff818280d4>] panic+0x1fb/0x412 kernel/panic.c:179
> >>>>> >> >>  [<ffffffff826bb0d4>] tty_ldisc_restore drivers/tty/tty_ldisc.c:520 [inline]
> >>>>> >> >>  [<ffffffff826bb0d4>] tty_set_ldisc+0x704/0x8b0 drivers/tty/tty_ldisc.c:579
> >>>>> >> >>  [<ffffffff826a3a93>] tiocsetd drivers/tty/tty_io.c:2667 [inline]
> >>>>> >> >>  [<ffffffff826a3a93>] tty_ioctl+0xc63/0x2370 drivers/tty/tty_io.c:2924
> >>>>> >> >>  [<ffffffff81a7a22f>] vfs_ioctl fs/ioctl.c:43 [inline]
> >>>>> >> >>  [<ffffffff81a7a22f>] do_vfs_ioctl+0x1bf/0x1630 fs/ioctl.c:679
> >>>>> >> >>  [<ffffffff81a7b72f>] SYSC_ioctl fs/ioctl.c:694 [inline]
> >>>>> >> >>  [<ffffffff81a7b72f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685
> >>>>> >> >>  [<ffffffff84377941>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> >>>>> >> >>
> >>>>> >> >> Kernel panic - not syncing: Couldn't open N_TTY ldisc for ptm2 --- error -12.
> >>>>> >> >> CPU: 0 PID: 7844 Comm: syz-executor0 Not tainted 4.9.0 #6
> >>>>> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >>>>> >> >>  ffff8801c3307a18 ffffffff8234d0df ffffffff00000000 1ffff10038660ed6
> >>>>> >> >>  ffffed0038660ece 0000000041b58ab3 ffffffff84b38180 ffffffff8234cdf1
> >>>>> >> >>  0000000000000000 0000000000000000 ffff8801c33076a8 00000000dabb4fad
> >>>>> >> >> Call Trace:
> >>>>> >> >>  [<ffffffff8234d0df>] __dump_stack lib/dump_stack.c:15 [inline]
> >>>>> >> >>  [<ffffffff8234d0df>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
> >>>>> >> >>  [<ffffffff818280d4>] panic+0x1fb/0x412 kernel/panic.c:179
> >>>>> >> >>  [<ffffffff826bb0d4>] tty_ldisc_restore drivers/tty/tty_ldisc.c:520 [inline]
> >>>>> >> >>  [<ffffffff826bb0d4>] tty_set_ldisc+0x704/0x8b0 drivers/tty/tty_ldisc.c:579
> >>>>> >> >>  [<ffffffff826a3a93>] tiocsetd drivers/tty/tty_io.c:2667 [inline]
> >>>>> >> >>  [<ffffffff826a3a93>] tty_ioctl+0xc63/0x2370 drivers/tty/tty_io.c:2924
> >>>>> >> >>  [<ffffffff81a7a22f>] vfs_ioctl fs/ioctl.c:43 [inline]
> >>>>> >> >>  [<ffffffff81a7a22f>] do_vfs_ioctl+0x1bf/0x1630 fs/ioctl.c:679
> >>>>> >> >>  [<ffffffff81a7b72f>] SYSC_ioctl fs/ioctl.c:694 [inline]
> >>>>> >> >>  [<ffffffff81a7b72f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685
> >>>>> >> >>  [<ffffffff84377941>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> >>>>> >> >>
> >>>>> >> >>
> >>>>> >> >> In all cases there is a vmalloc failure right before that:
> >>>>> >> >>
> >>>>> >> >> syz-executor4: vmalloc: allocation failure, allocated 0 of 16384 bytes, mode:0x14000c2(GFP_KERNEL|__GFP_HIGHMEM), nodemask=(null)
> >>>>> >> >> syz-executor4 cpuset=/ mems_allowed=0
> >>>>> >> >> CPU: 1 PID: 4852 Comm: syz-executor4 Not tainted 4.9.0 #6
> >>>>> >> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >>>>> >> >>  ffff8801c41df898 ffffffff8234d0df ffffffff00000001 1ffff1003883bea6
> >>>>> >> >>  ffffed003883be9e 0000000041b58ab3 ffffffff84b38180 ffffffff8234cdf1
> >>>>> >> >>  0000000000000282 ffffffff84fd53c0 ffff8801dae65b38 ffff8801c41df4d0
> >>>>> >> >> Call Trace:
> >>>>> >> >>  [<     inline     >] __dump_stack lib/dump_stack.c:15
> >>>>> >> >>  [<ffffffff8234d0df>] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
> >>>>> >> >>  [<ffffffff8186530f>] warn_alloc+0x21f/0x360
> >>>>> >> >>  [<ffffffff819792c9>] __vmalloc_node_range+0x4e9/0x770
> >>>>> >> >>  [<     inline     >] __vmalloc_node mm/vmalloc.c:1749
> >>>>> >> >>  [<     inline     >] __vmalloc_node_flags mm/vmalloc.c:1763
> >>>>> >> >>  [<ffffffff8197961b>] vmalloc+0x5b/0x70 mm/vmalloc.c:1778
> >>>>> >> >>  [<ffffffff826ad77b>] n_tty_open+0x1b/0x470 drivers/tty/n_tty.c:1883
> >>>>> >> >>  [<ffffffff826ba973>] tty_ldisc_open.isra.3+0x73/0xd0 drivers/tty/tty_ldisc.c:463
> >>>>> >> >>  [<     inline     >] tty_ldisc_restore drivers/tty/tty_ldisc.c:510
> >>>>> >> >>  [<ffffffff826bafb4>] tty_set_ldisc+0x5e4/0x8b0 drivers/tty/tty_ldisc.c:579
> >>>>> >> >>  [<     inline     >] tiocsetd drivers/tty/tty_io.c:2667
> >>>>> >> >>  [<ffffffff826a3a93>] tty_ioctl+0xc63/0x2370 drivers/tty/tty_io.c:2924
> >>>>> >> >>  [<ffffffff81a7a22f>] do_vfs_ioctl+0x1bf/0x1630
> >>>>> >> >>  [<     inline     >] SYSC_ioctl fs/ioctl.c:698
> >>>>> >> >>  [<ffffffff81a7b72f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
> >>>>> >> >>  [<ffffffff84377941>] entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:204
> >>>>> >> >>
> >>>>> >> >>
> >>>>> >> >> I've found that it's even documented in the source code, but it does
> >>>>> >> >> not look like a good failure mode for allocation failure:
> >>>>> >> >>
> >>>>> >> >> static int n_tty_open(struct tty_struct *tty)
> >>>>> >> >> {
> >>>>> >> >>         struct n_tty_data *ldata;
> >>>>> >> >>
> >>>>> >> >>         /* Currently a malloc failure here can panic */
> >>>>> >> >>         ldata = vmalloc(sizeof(*ldata));
> >>>>> >> >
> >>>>> >> > How are you running out of vmalloc() memory?
> >>>>> >>
> >>>>> >>
> >>>>> >> I don't know exactly. But it does not seem to represent a problem for
> >>>>> >> the fuzzer.
> >>>>> >> Is it meant to be very hard to do?
> >>>>> >
> >>>>> > Yes, do you know of any normal way to cause it to fail?
> >>>>>
> >>>>>
> >>>>> I don't. But that means approximately nothing.
> >>>>> Do you mean that it is not possible to trigger?
> >>>>> Won't simply creating lots of kernel resources (files, sockets,
> >>>>> pipes) do the trick? Or just paging in lots of memory? Even if the
> >>>>> process itself is chosen as the OOM kill target, it will still take
> >>>>> the machine down with it due to the panic while returning from the
> >>>>> syscall, no?
> >>>>
> >>>> I'm not saying that it's impossible, just an "almost" impossible thing
> >>>> to hit.  Obviously you have hit it, so it can happen :)
> >>>>
> >>>> But, how to fix it?  I really don't know.  Unwinding a failure at this
> >>>> point in time is very tough, as that comment shows.  Any suggestions of
> >>>> how it could be resolved are greatly appreciated.
> >>>
> >>> Is it possible to not shut down the old discipline in tty_set_ldisc
> >>> before we prepare everything for the new one?
> >>>
> >>>   /* Shutdown the old discipline. */
> >>>   tty_ldisc_close(tty, old_ldisc);
> >>>
> >>> Currently it does:
> >>>
> >>>   close(old)
> >>>   if (open(new))
> >>>     open(old) // assume never fails
> >>>
> >>> This looks inherently problematic.
> >>> Couldn't we do:
> >>>
> >>>   if (open(new))
> >>>     return -ESOMETHING
> >>>   close(old)
> >>>
> >>> ?
> >>
> >>
> >> Or can we just kill the task? Still better than kernel panic.
> >
> > I guess we can't get away with killing the task, as the tty will be
> > left in an inconsistent state and it is accessible to other tasks.
> > But what about creating the new ldisc first and then, if that
> > succeeds, destroying the old one?
> 
> 
> This is hurting us badly.

Really?  How?  Are you hitting this a lot?  Why now and never before?
Are you really out of memory?

> Opening the new ldisc before closing the old one turned out to be hard
> (too much state saved in tty).
> How about this one? It reuses the existing tty_ldisc_reinit helper. If
> opening both the old ldisc and N_TTY fails, it leaves ldisc == NULL. But
> that is already possible in tty_ldisc_hangup, and the code seems to be
> prepared for this.

<snip>

I'll look at this after -rc1 is out, thanks.

greg k-h
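
The ordering problem discussed in the quoted text (close the old discipline first, then assume that reopening it never fails) can be illustrated with a small userspace sketch. This is not kernel code and not the patch mentioned above: struct tty, struct ldisc, ldisc_open(), ldisc_close(), allocs_left and both set_ldisc_*() functions are hypothetical stand-ins invented for illustration, and the allocation failure is simulated with a counter. As noted in the thread, the open-first ordering turned out to be hard in the real tty layer because of the state saved in the tty, so the sketch only shows why the close-first ordering has no safe failure path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's real types. */
struct ldisc { int num; };
struct tty { struct ldisc *ldisc; };

/* Simulated allocator budget: -1 means "never fail", 0 means "fail now". */
static int allocs_left = -1;

/* Open a discipline; returns NULL when the simulated allocation fails. */
static struct ldisc *ldisc_open(int num)
{
        if (allocs_left == 0)
                return NULL;
        if (allocs_left > 0)
                allocs_left--;

        struct ldisc *ld = malloc(sizeof(*ld));
        if (ld)
                ld->num = num;
        return ld;
}

static void ldisc_close(struct ldisc *ld)
{
        free(ld);
}

/*
 * Close-first ordering, as described in the thread:
 *   close(old); if (open(new)) open(old);
 * Returns 0 on success, -1 if the old discipline was restored,
 * and -2 if the tty is left with no discipline at all (the case
 * in which the kernel code under discussion panics).
 */
static int set_ldisc_close_first(struct tty *tty, int new_num)
{
        assert(tty && tty->ldisc);
        int old_num = tty->ldisc->num;

        ldisc_close(tty->ldisc);
        tty->ldisc = ldisc_open(new_num);
        if (tty->ldisc)
                return 0;

        tty->ldisc = ldisc_open(old_num);
        return tty->ldisc ? -1 : -2;
}

/*
 * Open-first ordering, as proposed in the thread:
 *   if (open(new)) return error; close(old);
 * On failure the old discipline is untouched.
 */
static int set_ldisc_open_first(struct tty *tty, int new_num)
{
        assert(tty && tty->ldisc);
        struct ldisc *new_ld = ldisc_open(new_num);

        if (!new_ld)
                return -1;

        ldisc_close(tty->ldisc);
        tty->ldisc = new_ld;
        return 0;
}
```

With allocs_left set to 0 before the call, set_ldisc_close_first() leaves tty->ldisc as NULL, while set_ldisc_open_first() returns an error and keeps the original discipline in place.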