lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070414102528.GB20376@fi.muni.cz>
Date:	Sat, 14 Apr 2007 12:25:28 +0200
From:	Jan Kasprzak <kas@...muni.cz>
To:	Jiri Slaby <jirislaby@...il.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: MOXA: mxser_new lockup

Jiri Slaby wrote:
: I went through the locking again just now, but I can't see anything
: wrong. Could you please enable CONFIG_PROVE_LOCKING and CONFIG_DEBUG_LOCKDEP and
: retest?
: 
: And also post stty -a -F /dev/ttyMIXX or a setup passed to
: tcsetattr/ioctl(TCSETA/S) if available. What do you use for communication?

	I use cu(1) from the uucp package. This MOXA card is used as a
serial console for 8 other linux boxes (mgetty is on the other side).

	I have recompiled the kernel with debugging options you have
mentioned, connected to the remote serial console using "cu -l ttyMI0 -s 38400",
logged to the remote system, and ran "find / -print" to generate some
serial traffic. It went for a minute or so, but as soon as I have pressed
enter on my keyboard (generating traffic in an opposite direction),
I've got the recursive locking message (I have tried it twice, two
dumps are attached).

	Looking at the code, maybe the &port->slock in mxser_interrupt()
should be dropped before calling mxser_receive_chars() (and probably
also befor calling other mxser_* functions too). A proof-of-concept
patch attached - I am not able to reproduce the lockup with this patch.
Another approach would be to use in_interrupt() all over the code to
determine whether the lock should be re-aquired or not.

-Yenya

[ now the two "recursive locking" dumps, and the patch itself ]

=============================================
[ INFO: possible recursive locking detected ]
2.6.21-rc6 #8
---------------------------------------------
cu/1793 is trying to acquire lock:
 (&info->slock){++..}, at: [<ffffffff88003afd>] mxser_flush_chars+0x4d/0x83 [mxser_new]

but task is already holding lock:
 (&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]

other info that might help us debug this:
2 locks held by cu/1793:
 #0:  (&tty->atomic_write_lock){--..}, at: [<ffffffff802252dd>] tty_write+0x9e/0x211
 #1:  (&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]

stack backtrace:

Call Trace:
 <IRQ>  [<ffffffff8028c6e4>] __lock_acquire+0x155/0xbe1
 [<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
 [<ffffffff8028d1eb>] lock_acquire+0x7b/0x9f
 [<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
 [<ffffffff8025947c>] _spin_lock_irqsave+0x2d/0x3e
 [<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
 [<ffffffff80212a9b>] n_tty_receive_buf+0xd26/0xddc
 [<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
 [<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
 [<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
 [<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
 [<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
 [<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
 [<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
 [<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
 [<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
 [<ffffffff80253aa6>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff80259406>] _spin_unlock_irqrestore+0x40/0x44
 [<ffffffff802175dc>] write_chan+0x2e0/0x2ff
 [<ffffffff80275d26>] default_wake_function+0x0/0xe
 [<ffffffff802253b6>] tty_write+0x177/0x211
 [<ffffffff802172fc>] write_chan+0x0/0x2ff
 [<ffffffff802142a3>] vfs_write+0xce/0x157
 [<ffffffff80214baa>] sys_write+0x45/0x6e
 [<ffffffff8025355e>] system_call+0x7e/0x83

BUG: spinlock lockup on CPU#1, cu/1793, ffffffff880082a8

Call Trace:
 <IRQ>  [<ffffffff80207632>] _raw_spin_lock+0xcc/0xf3
 [<ffffffff80259484>] _spin_lock_irqsave+0x35/0x3e
 [<ffffffff88003afd>] :mxser_new:mxser_flush_chars+0x4d/0x83
 [<ffffffff80212a9b>] n_tty_receive_buf+0xd26/0xddc
 [<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
 [<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
 [<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
 [<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
 [<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
 [<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
 [<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
 [<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
 [<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
 [<ffffffff80253aa6>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff80259406>] _spin_unlock_irqrestore+0x40/0x44
 [<ffffffff802175dc>] write_chan+0x2e0/0x2ff
 [<ffffffff80275d26>] default_wake_function+0x0/0xe
 [<ffffffff802253b6>] tty_write+0x177/0x211
 [<ffffffff802172fc>] write_chan+0x0/0x2ff
 [<ffffffff802142a3>] vfs_write+0xce/0x157
 [<ffffffff80214baa>] sys_write+0x45/0x6e
 [<ffffffff8025355e>] system_call+0x7e/0x83


-----------------------------------------------------------------------
Another one:

=============================================
[ INFO: possible recursive locking detected ]
����ӡ���2.6.21-rc6 #8
---------------------------------------------
swapper/0 is trying to acquire lock:
 (&info->slock){++..}, at: [<ffffffff88003c6d>] mxser_stop+0x1c/0x43 [mxser_new]

but task is already holding lock:
 (&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]

other info that might help us debug this:
1 lock held by swapper/0:
 #0:  (&info->slock){++..}, at: [<ffffffff88003ead>] mxser_interrupt+0xc2/0x237 [mxser_new]

stack backtrace:

Call Trace:
 <IRQ>  [<ffffffff8028c6e4>] __lock_acquire+0x155/0xbe1
 [<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
 [<ffffffff8028d1eb>] lock_acquire+0x7b/0x9f
 [<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
 [<ffffffff8025947c>] _spin_lock_irqsave+0x2d/0x3e
 [<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
 [<ffffffff8021207a>] n_tty_receive_buf+0x305/0xddc
 [<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
 [<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
 [<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
 [<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
 [<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
 [<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
 [<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
 [<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
 [<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
 [<ffffffff8025f6ae>] default_idle+0x35/0x51
 [<ffffffff8025f679>] default_idle+0x0/0x51
 [<ffffffff80253aa6>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff8025f679>] default_idle+0x0/0x51
 [<ffffffff8025f6ae>] default_idle+0x35/0x51
 [<ffffffff8025f6b0>] default_idle+0x37/0x51
 [<ffffffff8025f6ae>] default_idle+0x35/0x51
 [<ffffffff802420fc>] cpu_idle+0x57/0x76


BUG: spinlock lockup on CPU#1, swapper/0, ffffffff880082a8

Call Trace:
 <IRQ>  [<ffffffff80207632>] _raw_spin_lock+0xcc/0xf3
 [<ffffffff80259484>] _spin_lock_irqsave+0x35/0x3e
 [<ffffffff88003c6d>] :mxser_new:mxser_stop+0x1c/0x43
 [<ffffffff8021207a>] n_tty_receive_buf+0x305/0xddc
 [<ffffffff8028d0db>] __lock_acquire+0xb4c/0xbe1
 [<ffffffff803377b1>] flush_to_ldisc+0x3d/0x191
 [<ffffffff8033788e>] flush_to_ldisc+0x11a/0x191
 [<ffffffff803378a5>] flush_to_ldisc+0x131/0x191
 [<ffffffff880025aa>] :mxser_new:mxser_receive_chars+0x25f/0x26e
 [<ffffffff88003f83>] :mxser_new:mxser_interrupt+0x198/0x237
 [<ffffffff8020ed84>] handle_IRQ_event+0x1e/0x51
 [<ffffffff8029ccd4>] handle_fasteoi_irq+0x9a/0xd9
 [<ffffffff80260ccd>] do_IRQ+0x6f/0xd6
 [<ffffffff8025f6ae>] default_idle+0x35/0x51
 [<ffffffff8025f679>] default_idle+0x0/0x51
 [<ffffffff80253aa6>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff8025f679>] default_idle+0x0/0x51
 [<ffffffff8025f6ae>] default_idle+0x35/0x51
 [<ffffffff8025f6b0>] default_idle+0x37/0x51
[...]

-----------------------------------------------------------------------

Signed-off-by: Jan "Yenya" Kasprzak <kas@...muni.cz>

--- linux-2.6.21-rc6/drivers/char/mxser_new.c.orig	2007-04-14 11:50:54.000000000 +0200
+++ linux-2.6.21-rc6/drivers/char/mxser_new.c	2007-04-14 12:14:28.000000000 +0200
@@ -2352,27 +2352,43 @@
 					if (iir == MOXA_MUST_IIR_GDA ||
 					    iir == MOXA_MUST_IIR_RDA ||
 					    iir == MOXA_MUST_IIR_RTO ||
-					    iir == MOXA_MUST_IIR_LSR)
+					    iir == MOXA_MUST_IIR_LSR) {
+						spin_unlock(&port->slock);
 						mxser_receive_chars(port,
 								&status);
+						spin_lock(&port->slock);
+					}
 
 				} else {
 					status &= port->read_status_mask;
-					if (status & UART_LSR_DR)
+					if (status & UART_LSR_DR) {
+						spin_unlock(&port->slock);
 						mxser_receive_chars(port,
 								&status);
+						spin_lock(&port->slock);
+					}
 				}
 				msr = inb(port->ioaddr + UART_MSR);
-				if (msr & UART_MSR_ANY_DELTA)
+				if (msr & UART_MSR_ANY_DELTA) {
+					spin_unlock(&port->slock);
 					mxser_check_modem_status(port, msr);
+					spin_lock(&port->slock);
+				}
 
 				if (port->board->chip_flag) {
 					if (iir == 0x02 && (status &
-								UART_LSR_THRE))
+							UART_LSR_THRE)) {
+						spin_unlock(&port->slock);
+						
 						mxser_transmit_chars(port);
+						spin_lock(&port->slock);
+					}
 				} else {
-					if (status & UART_LSR_THRE)
+					if (status & UART_LSR_THRE) {
+						spin_unlock(&port->slock);
 						mxser_transmit_chars(port);
+						spin_lock(&port->slock);
+					}
 				}
 			} while (int_cnt++ < MXSER_ISR_PASS_LIMIT);
 			spin_unlock(&port->slock);
-- 
| Jan "Yenya" Kasprzak  <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839      Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/    Journal: http://www.fi.muni.cz/~kas/blog/ |
> I will never go to meetings again because I think  face to face meetings <
> are the biggest waste of time you can ever have.        --Linus Torvalds <
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ