Date:	Tue, 22 Mar 2016 01:52:42 +0100
From:	Matthias Schiffer <mschiffer@...verse-factory.net>
To:	Greg KH <gregkh@...uxfoundation.org>
Cc:	Ralf Baechle <ralf@...ux-mips.org>, jslaby@...e.com,
	linux-mips@...ux-mips.org, linux-serial@...r.kernel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Nonterministic hang during bootconsole/console handover on ath79

On 03/22/2016 12:08 AM, Greg KH wrote:
> On Tue, Mar 22, 2016 at 12:02:57AM +0100, Matthias Schiffer wrote:
>> Hi,
>> we're experiencing weird nondeterministic hangs during bootconsole/console
>> handover on some ath79 systems on OpenWrt. I've seen this issue myself on
>> kernel 3.18.23~3.18.27 on a AR7241-based system, but according to other
>> reports ([1], [2]) kernel 4.1.x is affected as well, and other SoCs like
>> QCA953x likewise.
> 
> Can you try 4.4 or ideally, 4.5?  There's been a lot of console/tty
> fixes/changes since the obsolete 3.18 kernel you are using...
> 
> thanks,
> 
> greg k-h
> 

With 4.4, I was not able to reproduce this hang, but I have no idea whether
that is due to an actual bugfix or just to random timing changes hiding the
bug. I suspect the latter (as I wrote in my first mail, even minor
differences between kernel images of the same version and the same config
make the hang more or less probable). I have not yet been able to test 4.5,
as OpenWrt is a hell of kernel patches...

On 3.18, I also tried other things like disabling the early console
altogether, which also made the hang go away, but as even much smaller
changes hid the bug, this doesn't really say much.

The basic code path during the console handover seems to be the same in
3.18 and 4.4, even though a few functions have been moved; the relevant
part of the log looks the same:

> [    0.756298] Serial: 8250/16550 driver, 16 ports, IRQ sharing enabled
> [    0.766754] console [ttyS0] disabled
> [    0.790293] serial8250.0: ttyS0 at MMIO 0x18020000 (irq = 11, base_baud = 12500000) is a 16550A
> [    0.798909] console [ttyS0] enabled
> [    0.798909] console [ttyS0] enabled
> [    0.805854] bootconsole [early0] disabled
> [    0.805854] bootconsole [early0] disabled

So, with a view to an actual bugfix or backport, this boils down to two
questions, which I hope the serial or MIPS maintainers can answer:

* Is it sane to have two console drivers using the same serial port? In
particular, is it sane for the early console to use the serial port after
serial8250_config_port has reset/configured it, but before the rest of the
setup of uart_configure_port has run? (this would be the case for the
message "serial8250.0: ttyS0 at MMIO...")
* Is it possible to get the serial controller into a state in which
early_printk might wait for THRE forever?
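
To illustrate the second question, here is a minimal sketch of a polled
wait for THRE with a bounded number of reads. It is not the actual ath79
early_printk or 8250 code: wait_for_thre, lsr_read_fn, struct fake_uart and
fake_read_lsr are hypothetical names, and the fake UART merely stands in
for an MMIO read of the line status register. Only the THRE bit position
(LSR bit 5, 0x20) is taken from the 16550 register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UART_LSR_THRE 0x20 /* LSR bit 5: Transmit Holding Register Empty */

/* Hypothetical accessor; a real driver would do an MMIO read of the LSR. */
typedef uint8_t (*lsr_read_fn)(void *ctx);

/*
 * Poll the LSR until THRE is set, giving up after max_polls reads.
 * An unbounded variant of this loop never returns if THRE stays clear.
 */
static bool wait_for_thre(lsr_read_fn read_lsr, void *ctx, unsigned max_polls)
{
	while (max_polls--) {
		if (read_lsr(ctx) & UART_LSR_THRE)
			return true;
	}
	return false;
}

/* Fake UART for illustration: THRE reads as clear for busy_reads reads. */
struct fake_uart {
	unsigned busy_reads;
};

static uint8_t fake_read_lsr(void *ctx)
{
	struct fake_uart *u = ctx;

	if (u->busy_reads == 0)
		return UART_LSR_THRE;
	u->busy_reads--;
	return 0;
}
```

With a bound like this, a controller that never reports THRE would cost a
dropped character rather than a hang. Whether that is acceptable for an
early console is a separate question.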

Thanks,
Matthias


