[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070414102528.GB20376@fi.muni.cz>
Date: Sat, 14 Apr 2007 12:25:28 +0200
From: Jan Kasprzak <kas@...muni.cz>
To: Jiri Slaby <jirislaby@...il.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: MOXA: mxser_new lockup
Jiri Slaby wrote:
: I went through the locking again just now, but I can't see anything
: wrong. Could you please enable CONFIG_PROVE_LOCKING and CONFIG_DEBUG_LOCKDEP and
: retest?
:
: And also post stty -a -F /dev/ttyMIXX or a setup passed to
: tcsetattr/ioctl(TCSETA/S) if available. What do you use for communication?
I use cu(1) from the uucp package. This MOXA card is used as a
serial console for 8 other linux boxes (mgetty is on the other side).
I have recompiled the kernel with debugging options you have
mentioned, connected to the remote serial console using "cu -l ttyMI0 -s 38400",
logged to the remote system, and ran "find / -print" to generate some
serial traffic. It went for a minute or so, but as soon as I have pressed
enter on my keyboard (generating traffic in an opposite direction),
I've got the recursive locking message (I have tried it twice, two
dumps are attached).
Looking at the code, maybe the &port->slock in mxser_interrupt()
should be dropped before calling mxser_receive_chars() (and probably
also befor calling other mxser_* functions too). A proof-of-concept
patch attached - I am not able to reproduce the lockup with this patch.
Another approach would be to use in_interrupt() all over the code to
determine whether the lock should be re-aquired or not.
-Yenya
[ now the two "recursive locking" dumps, and the patch itself ]
=============================================
[ INFO: possible recursive locking detected ]
2.6.21-rc6 #8
---------------------------------------------
cu/1793 is trying to acquire lock:
(&info->slock){++..}, at: [<ffffffff88003afd>] mxser_flush_chars+0x4d/0x83 [mxser_new]
but task is already holding lock:
(&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]
other info that might help us debug this:
2 locks held by cu/1793:
#0: (&tty->atomic_write_lock){--..}, at: [<ffffffff802252dd>] tty_write+0x9e/0x211
#1: (&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]
stack backtrace:
Call Trace:
<IRQ> [<ffffffff8028c6e4>] __lock_acquire+0x155/0xbe1
[<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
[<ffffffff8028d1eb>] lock_acquire+0x7b/0x9f
[<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
[<ffffffff8025947c>] _spin_lock_irqsave+0x2d/0x3e
[<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
[<ffffffff80212a9b>] n_tty_receive_buf+0xd26/0xddc
[<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
[<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
[<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
[<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
[<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
[<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
[<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
[<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
[<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
[<ffffffff80253aa6>] ret_from_intr+0x0/0xf
<EOI> [<ffffffff80259406>] _spin_unlock_irqrestore+0x40/0x44
[<ffffffff802175dc>] write_chan+0x2e0/0x2ff
[<ffffffff80275d26>] default_wake_function+0x0/0xe
[<ffffffff802253b6>] tty_write+0x177/0x211
[<ffffffff802172fc>] write_chan+0x0/0x2ff
[<ffffffff802142a3>] vfs_write+0xce/0x157
[<ffffffff80214baa>] sys_write+0x45/0x6e
[<ffffffff8025355e>] system_call+0x7e/0x83
BUG: spinlock lockup on CPU#1, cu/1793, ffffffff880082a8
Call Trace:
<IRQ> [<ffffffff80207632>] _raw_spin_lock+0xcc/0xf3
[<ffffffff80259484>] _spin_lock_irqsave+0x35/0x3e
[<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
[<ffffffff80212a9b>] n_tty_receive_buf+0xd26/0xddc
[<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
[<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
[<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
[<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
[<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
[<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
[<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
[<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
[<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
[<ffffffff80253aa6>] ret_from_intr+0x0/0xf
<EOI> [<ffffffff80259406>] _spin_unlock_irqrestore+0x40/0x44
[<ffffffff802175dc>] write_chan+0x2e0/0x2ff
[<ffffffff80275d26>] default_wake_function+0x0/0xe
[<ffffffff802253b6>] tty_write+0x177/0x211
[<ffffffff802172fc>] write_chan+0x0/0x2ff
[<ffffffff802142a3>] vfs_write+0xce/0x157
[<ffffffff80214baa>] sys_write+0x45/0x6e
[<ffffffff8025355e>] system_call+0x7e/0x83
-----------------------------------------------------------------------
Another one:
=============================================
[ INFO: possible recursive locking detected ]
����ӡ���2.6.21-rc6 #8
---------------------------------------------
swapper/0 is trying to acquire lock:
(&info->slock){++..}, at: [<ffffffff88003c6d>] mxser_stop+0x1c/0x43 [mxser_new]
but task is already holding lock:
(&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]
other info that might help us debug this:
1 lock held by swapper/0:
#0: (&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]
stack backtrace:
Call Trace:
<IRQ> [<ffffffff8028c6e4>] __lock_acquire+0x155/0xbe1
[<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
[<ffffffff8028d1eb>] lock_acquire+0x7b/0x9f
[<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
[<ffffffff8025947c>] _spin_lock_irqsave+0x2d/0x3e
[<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
[<ffffffff8021207a>] n_tty_receive_buf+0x305/0xddc
[<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
[<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
[<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
[<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
[<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
[<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
[<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
[<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
[<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
[<ffffffff8025f6ae>] default_idle+0x35/0x51
[<ffffffff8025f679>] default_idle+0x0/0x51
[<ffffffff80253aa6>] ret_from_intr+0x0/0xf
<EOI> [<ffffffff8025f679>] default_idle+0x0/0x51
[<ffffffff8025f6ae>] default_idle+0x35/0x51
[<ffffffff8025f6b0>] default_idle+0x37/0x51
[<ffffffff8025f6ae>] default_idle+0x35/0x51
[<ffffffff802420fc>] cpu_idle+0x57/0x76
BUG: spinlock lockup on CPU#1, swapper/0, ffffffff880082a8
Call Trace:
<IRQ> [<ffffffff80207632>] _raw_spin_lock+0xcc/0xf3
[<ffffffff80259484>] _spin_lock_irqsave+0x35/0x3e
[<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
[<ffffffff8021207a>] n_tty_receive_buf+0x305/0xddc
[<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
[<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
[<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
[<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
[<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
[<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
[<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
[<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
[<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
[<ffffffff8025f6ae>] default_idle+0x35/0x51
[<ffffffff8025f679>] default_idle+0x0/0x51
[<ffffffff80253aa6>] ret_from_intr+0x0/0xf
<EOI> [<ffffffff8025f679>] default_idle+0x0/0x51
[<ffffffff8025f6ae>] default_idle+0x35/0x51
[<ffffffff8025f6b0>] default_idle+0x37/0x51
[...]
-----------------------------------------------------------------------
Signed-off-by: Jan "Yenya" Kasprzak <kas@...muni.cz>
--- linux-2.6.21-rc6/drivers/char/mxser_new.c.orig 2007-04-14 11:50:54.000000000 +0200
+++ linux-2.6.21-rc6/drivers/char/mxser_new.c 2007-04-14 12:14:28.000000000 +0200
@@ -2352,27 +2352,43 @@
if (iir == MOXA_MUST_IIR_GDA ||
iir == MOXA_MUST_IIR_RDA ||
iir == MOXA_MUST_IIR_RTO ||
- iir == MOXA_MUST_IIR_LSR)
+ iir == MOXA_MUST_IIR_LSR) {
+ spin_unlock(&port->slock);
mxser_receive_chars(port,
&status);
+ spin_lock(&port->slock);
+ }
} else {
status &= port->read_status_mask;
- if (status & UART_LSR_DR)
+ if (status & UART_LSR_DR) {
+ spin_unlock(&port->slock);
mxser_receive_chars(port,
&status);
+ spin_lock(&port->slock);
+ }
}
msr = inb(port->ioaddr + UART_MSR);
- if (msr & UART_MSR_ANY_DELTA)
+ if (msr & UART_MSR_ANY_DELTA) {
+ spin_unlock(&port->slock);
mxser_check_modem_status(port, msr);
+ spin_lock(&port->slock);
+ }
if (port->board->chip_flag) {
if (iir == 0x02 && (status &
- UART_LSR_THRE))
+ UART_LSR_THRE)) {
+ spin_unlock(&port->slock);
+
mxser_transmit_chars(port);
+ spin_lock(&port->slock);
+ }
} else {
- if (status & UART_LSR_THRE)
+ if (status & UART_LSR_THRE) {
+ spin_unlock(&port->slock);
mxser_transmit_chars(port);
+ spin_lock(&port->slock);
+ }
}
} while (int_cnt++ < MXSER_ISR_PASS_LIMIT);
spin_unlock(&port->slock);
--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ |
> I will never go to meetings again because I think face to face meetings <
> are the biggest waste of time you can ever have. --Linus Torvalds <
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists