Message-ID: <20090715165308.GA5228@skywalker>
Date: Wed, 15 Jul 2009 22:23:08 +0530
From: "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
To: Alan Cox <alan@...rguk.ukuu.org.uk>
Cc: linux-kernel@...r.kernel.org
Subject: Re: tty related hangs with 2.6.31-rc3
On Wed, Jul 15, 2009 at 04:11:42PM +0100, Alan Cox wrote:
> On Wed, 15 Jul 2009 18:59:56 +0530
> "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com> wrote:
>
> > Hi,
> >
> > I am seeing tty related hangs with 2.6.31-rc3. This didn't happen
> > before. It happens when I close the emacs session. The /proc/<pid>/stack
> > content is below
>
> Thanks - nice clear trace. Looks like a bug in the n_tty locking changes
> from a few releases back that the pty changes are triggering.
>
> Basically process_echoes calls tty_put_char which, if it thinks a device
> queue was full and now has a bit of space, will call tty_wakeup, which can
> call process_echoes again and thus deadlock. With a physical serial device
> we will even sometimes call tty_wakeup() from the serial transmit path,
> which is an irq path (which makes this doubly wrong, as it then takes
> mutexes).
>
> Emacs presumably uses fasync which is the trigger for this. You need the
> right timing combined with the new pty behaviour combined with FASYNC to
> trigger it.
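
The re-entry described above can be reproduced in userspace with a small
sketch. This is only an illustration, not kernel code: the toy_* names and
the struct are hypothetical stand-ins for process_echoes, tty_put_char and
tty_wakeup, and an error-checking pthread mutex is assumed in place of the
lock n_tty really takes, so that the second lock attempt reports EDEADLK
instead of hanging.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for one end of a pty; not the kernel's tty_struct. */
struct toy_tty {
    pthread_mutex_t output_lock; /* plays the role of the lock held while echoing */
    int queue_was_full;          /* "queue was full and now has a bit of space" */
    int nested_lock_result;      /* result of the re-entrant call, if it happened */
};

static int toy_process_echoes(struct toy_tty *tty);

/* Stand-in for tty_wakeup: the ldisc wakeup processes echoes again. */
static void toy_wakeup(struct toy_tty *tty)
{
    tty->nested_lock_result = toy_process_echoes(tty);
}

/* Stand-in for tty_put_char: wakes the tty if the queue just drained. */
static void toy_put_char(struct toy_tty *tty)
{
    if (tty->queue_was_full) {
        tty->queue_was_full = 0;
        toy_wakeup(tty);
    }
}

/* Stand-in for process_echoes: takes the lock, then writes a character. */
static int toy_process_echoes(struct toy_tty *tty)
{
    int err = pthread_mutex_lock(&tty->output_lock);

    if (err)
        return err; /* EDEADLK when this thread already holds the lock */
    toy_put_char(tty);
    pthread_mutex_unlock(&tty->output_lock);
    return 0;
}

/* Set up the toy tty with an error-checking mutex so re-entry is reported. */
static int toy_tty_init(struct toy_tty *tty)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!err)
        err = pthread_mutex_init(&tty->output_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    tty->queue_was_full = 0;
    tty->nested_lock_result = 0;
    return err;
}
```

After toy_tty_init(), setting queue_was_full to 1 and calling
toy_process_echoes() once walks the same call order as the trace: the outer
call succeeds, while the nested call fails to take the lock and leaves
EDEADLK in nested_lock_result. With an ordinary non-recursive lock the nested
call would block forever instead, which matches the hang.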
>
>
> > [<c0362c5b>] process_echoes+0x2b/0x2e0
> [which tries to take the lock we already hold (end A)]
> > [<c03638cb>] n_tty_write_wakeup+0xb/0x40
> [which processes our ldisc wakeup (end A)]
> > [<c0360a88>] tty_wakeup+0x58/0x70
> [which wakes up our tty (end A)]
> > [<c0368347>] pty_write+0x67/0x70
> [our write method is for tty/pty pairs end A output, queued to end B]
> > [<c035f1cb>] tty_put_char+0x2b/0x40
> [calls tty_put_char to write the echoed byte to end A output]
> > [<c036291f>] do_output_char+0xef/0x200
> [the fake typed character is echoed back towards end B]
> > [<c0362d4e>] process_echoes+0x11e/0x2e0
> [tries to process echo characters on end A]
> > [<c0364292>] n_tty_receive_char+0x102/0x710
> [ receives a byte that we've faked typing to end A input]
> > [<c0364ac0>] n_tty_receive_buf+0x220/0x410
> [ioctl method calls the ld->ops->receive_buf for n_tty (unsafely, but that
> bug is old)]
> > [<c036059c>] tiocsti+0x8c/0xa0
> > [<c0361aca>] tty_ioctl+0x25a/0x310
> > [<c01e52c8>] vfs_ioctl+0x28/0x80
> > [<c01e54f4>] do_vfs_ioctl+0x64/0x1c0
> > [<c01e56a3>] sys_ioctl+0x53/0x70
> > [<c0102e3c>] sysenter_do_call+0x12/0x28
> > [<ffffffff>] 0xffffffff
>
> Do you have the lock validator enabled and, if so, did it have anything
> useful to report?
>
> Please try the following. I suspect this is the real fix:
>
With limited testing I am not seeing the hang any more. I will keep running
with the patch and report back if I hit the hang again.
-aneesh