Message-ID: <1482152376.9552.96.camel@linux.intel.com>
Date: Mon, 19 Dec 2016 14:59:36 +0200
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To: Douglas Anderson <dianders@...omium.org>,
gregkh@...uxfoundation.org, jslaby@...e.com
Cc: briannorris@...omium.org, linux-rockchip@...ts.infradead.org,
jeffy.chen@...k-chips.com, eric.gao@...k-chips.com,
peter@...leysoftware.com, phillip.raffeck@....de,
anton.wuerfel@....de, yegorslists@...glemail.com,
matwey@....msu.ru, tthayer@...nsource.altera.com,
linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] serial: 8250: Avoid "too much work" from bogus rx timeout interrupt
On Sun, 2016-12-18 at 17:14 -0800, Douglas Anderson wrote:
> On a Rockchip rk3399-based board during suspend/resume testing, we
> found that we could get the console UART into a state where it would
> print this to the console a lot:
> serial8250: too much work for irq42
Have you read the following discussion?
https://www.spinics.net/lists/kernel/msg2059543.html
>
> Followed eventually by:
> NMI watchdog: BUG: soft lockup - CPU#0 stuck for 11s!
>
> Upon debugging I found that we're in this state:
> iir = 0x000000cc
> lsr = 0x00000060
>
> It appears that somehow we have a RX Timeout interrupt but there is no
> actual data present to receive. When we're in this state the UART
> driver claims that it handled the interrupt but it actually doesn't
> really do anything. This means that we keep getting the interrupt
> over and over again.
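
For reference, the two register values quoted above can be decoded with a
few lines of C. This is only a minimal userspace sketch: the constants are
redefined locally with the values from include/uapi/linux/serial_reg.h, and
the helper name is_bogus_rx_timeout() is hypothetical, not something in the
driver.

```c
#include <stdbool.h>

/* Values copied from include/uapi/linux/serial_reg.h so this builds in userspace. */
#define UART_IIR_RX_TIMEOUT 0x0c /* character timeout interrupt ID */
#define UART_LSR_DR         0x01 /* receiver data ready */
#define UART_LSR_BI         0x10 /* break interrupt indicator */
#define UART_LSR_THRE       0x20 /* transmit-hold-register empty */
#define UART_LSR_TEMT       0x40 /* transmitter empty */

/*
 * Hypothetical helper: true when the IIR reports an RX timeout but the
 * LSR shows nothing to receive. Bit 0 of the IIR is clear whenever the
 * masked value equals UART_IIR_RX_TIMEOUT, so an interrupt is pending.
 */
static bool is_bogus_rx_timeout(unsigned int iir, unsigned int lsr)
{
	bool timeout = (iir & 0x3f) == UART_IIR_RX_TIMEOUT;
	bool rx_work = (lsr & (UART_LSR_DR | UART_LSR_BI)) != 0;

	return timeout && !rx_work;
}
```

With iir = 0xcc the low bits are 0x0c (RX timeout, FIFOs enabled), and
lsr = 0x60 is THRE | TEMT with DR clear, so the helper returns true for the
reported state.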
>
> Normally we don't actually need to do anything special to handle a RX
> Timeout interrupt. We'll notice that there is some data ready and
> we'll read it, which will end up clearing the RX Timeout. In this
> case we have a problem specifically because we got the RX Timeout
> without any data. Reading a bogus byte is confirmed to get us out of
> this state.
>
> It's unclear how exactly the UART got into this state, but it is known
> that the UART lines are essentially undriven and unpowered during
> suspend, so possibly during resume some garbage / half transmitted
> bits are seen on the line and put the UART into this state.
>
> The UART on the rk3399 is a DesignWare based 8250 UART but I have
> placed this fix in the general 8250 code because it shouldn't hurt to
> have this detection on all 8250 UARTs and it's plausible some other
> UART could get into the same state. If these two extra lines of code
> are too much overhead, we can certainly move it into the DesignWare
> driver or even only do it for Rockchip UARTs.
>
> Signed-off-by: Douglas Anderson <dianders@...omium.org>
> ---
> Testing and development done on a kernel-4.4 based tree, then picked
> to ToT, where the code applied cleanly.
>
> drivers/tty/serial/8250/8250_port.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index fe4399b41df6..8582c068c3d1 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -1824,6 +1824,12 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
>  	if (status & (UART_LSR_DR | UART_LSR_BI)) {
>  		if (!up->dma || handle_rx_dma(up, iir))
>  			status = serial8250_rx_chars(up, status);
> +	} else if ((iir & 0x3f) == UART_IIR_RX_TIMEOUT) {
> +		/*
> +		 * On some systems we saw the timeout interrupt even when
> +		 * there was no data ready. Do a bogus read to clear it.
> +		 */
> +		(void) serial_port_in(port, UART_RX);
>  	}
>  	serial8250_modem_status(up);
>  	if ((!up->dma || up->dma->tx_err) && (status & UART_LSR_THRE))
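
To make the failure mode concrete, here is a toy userspace model of the
loop described in the commit message. It is a sketch under stated
assumptions, not driver code: struct fake_uart, the fake_*() accessors,
handle_irq() and passes_until_idle() are all hypothetical, and the model
simply assumes that reading the RX register deasserts the timeout, as the
patch description reports.

```c
#include <stdbool.h>

/* Values copied from include/uapi/linux/serial_reg.h. */
#define UART_IIR_NO_INT     0x01
#define UART_IIR_RX_TIMEOUT 0x0c
#define UART_LSR_DR         0x01
#define UART_LSR_BI         0x10
#define UART_LSR_THRE       0x20
#define UART_LSR_TEMT       0x40

/* Hypothetical model of a UART stuck with an RX timeout and an empty FIFO. */
struct fake_uart {
	bool timeout_pending;
	unsigned int rx_reads;
};

static unsigned int fake_read_iir(const struct fake_uart *u)
{
	/* 0xc0: FIFOs enabled, matching the reported iir = 0xcc. */
	return 0xc0 | (u->timeout_pending ? UART_IIR_RX_TIMEOUT : UART_IIR_NO_INT);
}

static unsigned int fake_read_lsr(const struct fake_uart *u)
{
	(void)u;
	return UART_LSR_THRE | UART_LSR_TEMT; /* 0x60: no data ready */
}

static void fake_read_rx(struct fake_uart *u)
{
	u->rx_reads++;
	u->timeout_pending = false; /* assumption: the read clears the timeout */
}

/* One handler pass; returns true if an interrupt was claimed as handled. */
static bool handle_irq(struct fake_uart *u, bool with_fix)
{
	unsigned int iir = fake_read_iir(u);
	unsigned int lsr;

	if (iir & UART_IIR_NO_INT)
		return false;

	lsr = fake_read_lsr(u);
	if (lsr & (UART_LSR_DR | UART_LSR_BI))
		fake_read_rx(u); /* normal receive path */
	else if (with_fix && (iir & 0x3f) == UART_IIR_RX_TIMEOUT)
		fake_read_rx(u); /* dummy read, as in the patch */

	return true;
}

/* Count handler passes until the interrupt goes away, capped at limit. */
static int passes_until_idle(struct fake_uart *u, bool with_fix, int limit)
{
	int n = 0;

	while (n < limit && handle_irq(u, with_fix))
		n++;
	return n;
}
```

Without the dummy read the model claims the interrupt on every pass and
only stops at the cap, which corresponds to the "too much work" message;
with the dummy read it goes idle after a single pass.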
--
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
Intel Finland Oy