Message-ID: <87a6bwapij.fsf@jogness.linutronix.de>
Date: Thu, 05 May 2022 00:48:28 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Marek Szyprowski <m.szyprowski@...sung.com>,
Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-amlogic@...ts.infradead.org
Subject: Re: [PATCH printk v5 1/1] printk: extend console_lock for
per-console locking
On 2022-05-04, John Ogness <john.ogness@...utronix.de> wrote:
> I can reproduce the apparent stack corruption with qemu:
>
> [ 5.545268] task:pr/ttyAMA0 state:S stack: 0 pid: 26 ppid: 2 flags:0x00000008
> [ 5.545520] Call trace:
> [ 5.545620] __switch_to+0x104/0x160
> [ 5.545796] __schedule+0x2f4/0x9f0
> [ 5.546122] schedule+0x54/0xd0
> [ 5.546206] 0x0
I believe I am chasing a ghost. I can rather easily reproduce these
strange call traces, but if another sysrq-t is sent afterwards, the call
trace is OK. Also, I added trace_dump_stack() into the printk-kthread
main loop to dump the stack on every iteration. There I never see any
corruption, even though the timestamps are near the sysrq-t dump showing
corruption. Moving trace_dump_stack() into
amba-pl011:pl011_console_write() also showed no stack corruption, even
at times very close to when the sysrq-t dump did.
And it should be noted that the console-hanging issues reported in this
thread _cannot_ be reproduced with qemu.
So I will stop focussing on this "corrupt stack" thing and instead
investigate what the meson driver is doing that causes it to get
stuck. Since interrupts do not even fire, I'm guessing that the RX
interrupts are not being re-enabled (AML_UART_RX_INT_EN) for some code
path. This bit is only explicitly set once, in
meson_uart_startup(). Whenever the bit is cleared, the previous value
is restored afterwards, on the assumption that this re-enables the
interrupt. But if there is some code path where multiple CPUs can
modify the register, then the interrupt could end up permanently
disabled.
I will go through and check whether all accesses to AML_UART_CONTROL
are protected by port->lock.
John