lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 31 Oct 2010 04:36:18 +0100
From:	Arnd Bergmann <arnd@...db.de>
To:	Eli Billauer <eli@...lauer.co.il>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: open() on /dev/tty takes 30 seconds on 2.6.36

On Sunday 31 October 2010, Eli Billauer wrote:
> Andrew Morton wrote:
> 
> First, I'll explain what I did: Since I can't login while this problem 
> occurs, I had to modify sshd's service script to start sshd with strace, 
> and also kick off a script (called who-is-locking.sh), which creates 
> that sysrq call every five seconds. This could also be the place to 
> mention, that I got an NULL dereference oops during one of these (IP at 
> get_usage_chars+0x0/0x10e, while in the who-is-locking process). But 
> that happened after the desired data was logged.

An oops report is interesting most of the time, except when you know
that you caused it with your own kernel changes. If you can still
find the oops report, please post it here.

> Oct 31 01:14:28 ocho kernel: Serial: 8250/16550 driver, 4 ports, IRQ 
> sharing enabled
> Oct 31 01:14:28 ocho kernel: serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is 
> a 16550A
> Oct 31 01:14:28 ocho kernel: 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> Oct 31 01:14:28 ocho kernel: serial 0000:05:04.0: PCI INT A -> GSI 16 
> (level, low) -> IRQ 16
> Oct 31 01:14:28 ocho kernel: 0000:05:04.0: ttyS1 at I/O 0xbe20 (irq = 
> 16) is a 8250
> Oct 31 01:14:28 ocho kernel: 0000:05:04.0: ttyS2 at I/O 0xbe28 (irq = 
> 16) is a 8250
> Oct 31 01:14:28 ocho kernel: 0000:05:04.0: ttyS3 at I/O 0xbe40 (irq = 
> 16) is a 8250
> Oct 31 01:14:28 ocho kernel: Couldn't register serial port 0000:05:04.0: -28

For some reason, your kernel finds five ports at boot time, while it's
been configured at build time to support only four.

> Oct 31 01:23:59 ocho kernel: =============================================
> Oct 31 01:23:59 ocho kernel:
> Oct 31 01:24:04 ocho kernel: SysRq : Show Locks Held
> Oct 31 01:24:04 ocho kernel:
> Oct 31 01:24:04 ocho kernel: Showing all locks held in the system:
> Oct 31 01:24:04 ocho kernel: 2 locks held by modem-manager/1835:
> Oct 31 01:24:04 ocho kernel: #0:  (big_tty_mutex){+.+.+.}, at: 
> [<ffffffff814b9c09>] tty_lock+0x3f/0x4a
> Oct 31 01:24:04 ocho kernel: #1:  (&port->mutex){+.+.+.}, at: 
> [<ffffffff812fcace>] uart_close+0x82/0x312

So modem-manager is the process you need to look at, as it holds the
BTM, probably for a long time. It should not do that. First you should
verify that when you don't have modem-manager running, the problem goes
away. This may indeed be related to the misdetection of a serial port.

The other thing to check is what modem-manager is actually doing. You can
find out by looking at /proc/<pid>/stack, where <pid> is the pid of the
modem-manager process, 1835 in your example.

You can also look at <sysrq>+t to get all the stack traces for the running
processes and finding the process in there. 

I think you're closing in on the bug, thanks to your providing exactly the
right parts of information.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists