lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 16 Jun 2009 01:45:36 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Alan Cox <alan@...rguk.ukuu.org.uk>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>
Subject: bug in tty ldisc and friends


[ Ingo and Andrew, sorry for the repost, but I forgot to Cc LKML and I 
  also used Alan's old email. ]

I can easily trigger a bug in the tty code by enabling the event tracers.

Attached is the config.

One dump I get is:

RIP: 0010:[<ffffffff812591b4>]  [<ffffffff812591b4>] 
pty_chars_in_buffer+0x24/0x70
RSP: 0018:ffff880025f83938  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88003b986000 RCX: 0000000000000000
RDX: ffffffff81259190 RSI: 0000000000000001 RDI: ffff880030da5000
RBP: ffff880025f83948 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffff88003389ae40 R14: ffff88003b9fe400 R15: 0000000000000000
FS:  00007f97e5c6a710(0000) GS:ffff880004000000(0000) 
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000003d195000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process sshd (pid: 3378, threadinfo ffff880025f82000, task 
ffff880025f80840)
Stack:
 ffff88003b986000 ffff88003b986000 ffff880025f83958 ffffffff8100bebd
 ffff880025f83988 ffffffff8100bebd ffff88003389ae40 ffff88003b986000
 0000000000000000 ffff88003b9fe440 ffff880025f839c8 ffffffff8100bebd
Call Trace:
 [<ffffffff8100bebd>] return_to_handler+0x0/0x33
 [<ffffffff812557ea>] tty_chars_in_buffer+0x1a/0x20
 [<ffffffff8100bebd>] return_to_handler+0x0/0x33
 [<ffffffff81252a20>] n_tty_poll+0x170/0x1a0
 [<ffffffff8100bebd>] return_to_handler+0x0/0x33
 [<ffffffff8124e3ea>] tty_poll+0x8a/0xa0
 [<ffffffff8100bebd>] return_to_handler+0x0/0x33
 [<ffffffff811126aa>] do_select+0x3ba/0x6d0
 [<ffffffff811122f0>] ? do_select+0x0/0x6d0
 [<ffffffff81113070>] ? __pollwait+0x0/0x100
 [<ffffffff81113170>] ? pollwake+0x0/0x50
 [<ffffffff81113170>] ? pollwake+0x0/0x50
 [<ffffffff81113170>] ? pollwake+0x0/0x50
 [<ffffffff81113170>] ? pollwake+0x0/0x50
 [<ffffffff81113170>] ? pollwake+0x0/0x50
 [<ffffffff81113170>] ? pollwake+0x0/0x50
 [<ffffffff81113170>] ? pollwake+0x0/0x50
 [<ffffffff810a8615>] ? ring_buffer_lock_reserve+0x105/0x1e0
 [<ffffffff810aca30>] ? trace_buffer_lock_reserve+0x50/0x70
 [<ffffffff810ad588>] ? trace_nowake_buffer_unlock_commit+0x58/0x70
 [<ffffffff81077e4e>] ? put_lock_stats+0xe/0x30
 [<ffffffff81077f22>] ? lock_release_holdtime+0xb2/0x160
 [<ffffffff81112bfa>] core_sys_select+0x23a/0x390
 [<ffffffff81112a0d>] ? core_sys_select+0x4d/0x390
 [<ffffffff81077e4e>] ? put_lock_stats+0xe/0x30
 [<ffffffff81067ef0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff813bd642>] ? _spin_unlock_irqrestore+0x42/0x80
 [<ffffffff81112fa7>] sys_select+0x47/0x110
 [<ffffffff813bcfb7>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8100c0db>] system_call_fastpath+0x16/0x1b
Code: 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 e8 32 2c db ff 48 89 
fb 48 8b bf 20 02 00 00 48 85 ff 74 3b 48 8b 87 b8 00 00 00 <48> 8b 00 48 
8b 40 30 48 85 c0 74 28 ff d0 48 8b 53 08 66 83 ba 


I don't know enough about the tty layer to debug it, but looking at the 
asm, I found I hit a bug NULL pointer here:

static int pty_chars_in_buffer(struct tty_struct *tty)
{
        struct tty_struct *to = tty->link;
        int count;

        /* We should get the line discipline lock for "tty->link" */
        if (!to || !to->ldisc->ops->chars_in_buffer)
                return 0;


The to->ldisc is NULL.

I can trigger other weird crashes too that all seem to happen when ldisc 
is being modified.

Here's what I did to get some good output:

# echo 1 > /proc/sys/kernel/ftrace_dump_on_oops
# echo no_context:traceoff no_context '*tty*' '*pty*' '*ldisc*' > /debug/tracing/set_ftrace_filter

The 'no_context:traceoff' will stop tracing when we hit the no_context 
function in the fault handler. This prevents us from losing too much data 
from the trace.

# echo function > /debug/tracing/current_tracer
# echo 1 > /debug/tracing/events/enable

Then I just ssh to the box, and exit over and over, and usually in about 2 
to 5 tries it crashes. If you have the box hooked up to a serial port, 
you'll see the trace up to the crash.

If you want to add more functions to trace (tracing all fuctions seems to 
be too much and does not trigger the race) just append to the file:

# echo new_func >> /debug/tracing/set_ftrace_filter

also, adding ftrace_dump_on_opps to the command line sets it 
automatically.

You can scatter trace_printk()'s in the code if you want more info too.

Here's a trace dump I got:

   udevd-3583    3...1 8218205us : lock_acquire: read &namespace_sem
    sshd-3579    0d..2 8218205us : lock_release: tty_ldisc_lock
    sshd-3584    2...1 8218205us : tty_ldisc_try_get <-tty_ldisc_get
   udevd-3583    3d..2 8218206us : lock_acquire: &sem->wait_lock
   udevd-3583    3d..2 8218206us : lock_acquired: &sem->wait_lock (0.000 us)
    sshd-3584    2...2 8218206us : kmalloc: call_site=ffffffff81256f90 ptr=ffff880028082b60 bytes_req=16 bytes_alloc=32 gfp_flags=GFP_KERNEL
    sshd-3579    0.... 8218207us : n_tty_poll <-tty_poll
   udevd-3583    3d..2 8218207us : lock_release: &sem->wait_lock
    sshd-3584    2d..3 8218207us : lock_acquire: tty_ldisc_lock
    sshd-3579    0.... 8218208us : tty_hung_up_p <-n_tty_poll
   udevd-3583    3d..1 8218208us : lock_acquired: &namespace_sem (0.000 us)
    sshd-3579    0.... 8218209us : tty_chars_in_buffer <-n_tty_poll
    sshd-3584    2d..3 8218209us : lock_acquired: tty_ldisc_lock (0.000 us)
   udevd-3583    3...1 8218209us : lock_release: &namespace_sem
   udevd-3583    3d..2 8218209us : lock_acquire: &sem->wait_lock
    sshd-3579    0.... 8218209us : pty_chars_in_buffer <-tty_chars_in_buffer
    sshd-3584    2d..3 8218210us : lock_release: tty_ldisc_lock
   udevd-3583    3d..2 8218211us : lock_acquired: &sem->wait_lock (0.000 us)
   udevd-3583    3d..2 8218211us : lock_release: &sem->wait_lock
    sshd-3584    2...1 8218211us : tty_set_termios_ldisc <-tty_ldisc_reinit
   udevd-3583    3d..1 8218212us : lock_release: &p->lock
    sshd-3579    0...3 8218212us : lock_acquire: read rcu_read_lock
    sshd-3584    2...3 8218212us : lock_acquire: &tty->termios_mutex
    sshd-3579    0...3 8218213us : lock_release: rcu_read_lock
    sshd-3584    2d..3 8218214us : lock_acquired: &tty->termios_mutex (0.000 us)
   udevd-3583    3...2 8218214us : lock_acquire: &newf->file_lock
    sshd-3579    0d..2 8218214us : lock_acquire: &sem->wait_lock
   udevd-3583    3d..2 8218215us : lock_acquired: &newf->file_lock (0.000 us)
   udevd-3583    3...2 8218215us : lock_release: &newf->file_lock
    sshd-3584    2d..2 8218215us : lock_release: &tty->termios_mutex
    sshd-3579    0d..2 8218216us : lock_acquired: &sem->wait_lock (0.000 us)
    sshd-3579    0d..2 8218217us : lock_release: &sem->wait_lock
    sshd-3584    2...1 8218217us : tty_ldisc_open <-tty_ldisc_hangup
   udevd-3583    3...1 8218217us : kfree: call_site=ffffffff8111e33d ptr=ffff88003b868000
    sshd-3579    0...1 8218218us : lock_acquire: try read &mm->mmap_sem
    sshd-3584    2...1 8218218us : n_tty_open <-tty_ldisc_open
   udevd-3583    3...1 8218218us : kfree: call_site=ffffffff8111e345 ptr=ffff88003cf7b8c0
    sshd-3584    2...1 8218220us : n_tty_open: tty=ffff88000a839000 read_buf=(null)
   udevd-3583    3...2 8218220us : lock_acquire: files_lock
    sshd-3579    0...1 8218221us : lock_release: &mm->mmap_sem
   udevd-3583    3d..2 8218221us : lock_acquired: files_lock (0.000 us)
   udevd-3583    3...2 8218222us : lock_release: files_lock
    sshd-3579    0d..2 8218222us : lock_acquire: &sem->wait_lock
    sshd-3584    2...2 8218222us : kmalloc: call_site=ffffffff8125566f ptr=ffff88000a83d000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL|GFP_ZERO
    sshd-3579    0d..2 8218223us : lock_acquired: &sem->wait_lock (0.000 us)
    sshd-3584    2...1 8218224us : n_tty_open: tty=ffff88000a839000 read_buf=ffff88000a83d000
   udevd-3583    3...2 8218224us : lock_acquire: dcache_lock
    sshd-3579    0d..2 8218224us : lock_release: &sem->wait_lock
   udevd-3583    3d..2 8218225us : lock_acquired: dcache_lock (0.000 us)
   udevd-3583    3...3 8218225us : lock_acquire: &dentry->d_lock
   udevd-3583    3d..3 8218226us : lock_acquired: &dentry->d_lock (0.000 us)
    sshd-3579    0.... 8218226us : no_context <-__bad_area_nosemaphore

The cryptic '3d..2' is CPU 3, 'd' means interrupts are disabled, 'N' would 
mean need resched set (but it is not because it is a '.') and the last is 
the preempt count, although the event markers may disable preemption and 
report an incorrect preempt count.

-- Steve
Download attachment "config.gz" of type "APPLICATION/OCTET-STREAM" (15305 bytes)

Powered by blists - more mailing lists