[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAD=FV=VmYK5PeGXUcbJrnwnhdyfbNJLdRjwhbj7JCN5s-JmTAw@mail.gmail.com>
Date: Fri, 18 Jul 2025 08:20:32 -0700
From: Doug Anderson <dianders@...omium.org>
To: John Ogness <john.ogness@...utronix.de>
Cc: yunhui cui <cuiyunhui@...edance.com>, arnd@...db.de,
andriy.shevchenko@...ux.intel.com, benjamin.larsson@...exis.eu,
gregkh@...uxfoundation.org, heikki.krogerus@...ux.intel.com,
ilpo.jarvinen@...ux.intel.com, jirislaby@...nel.org,
jkeeping@...usicbrands.com, linux-kernel@...r.kernel.org,
linux-serial@...r.kernel.org, markus.mayer@...aro.org, matt.porter@...aro.org,
namcao@...utronix.de, paulmck@...nel.org, pmladek@...e.com,
schnelle@...ux.ibm.com, sunilvl@...tanamicro.com, tim.kryger@...aro.org,
stable@...r.kernel.org
Subject: Re: [External] Re: [PATCH v9 2/4] serial: 8250_dw: fix PSLVERR on RX_TIMEOUT
Hi,
On Thu, Jul 17, 2025 at 7:14 AM John Ogness <john.ogness@...utronix.de> wrote:
>
> Added Douglas Anderson, author of commit 424d79183af0 ("serial: 8250_dw:
> Avoid "too much work" from bogus rx timeout interrupt").
>
> On 2025-07-11, yunhui cui <cuiyunhui@...edance.com> wrote:
> >> On 2025-06-23, yunhui cui <cuiyunhui@...edance.com> wrote:
> >> >> The DW UART may trigger the RX_TIMEOUT interrupt without data
> >> >> present and remain stuck in this state indefinitely. The
> >> >> dw8250_handle_irq() function detects this condition by checking
> >> >> if the UART_LSR_DR bit is not set when RX_TIMEOUT occurs. When
> >> >> detected, it performs a "dummy read" to recover the DW UART from
> >> >> this state.
> >> >>
> >> >> When the PSLVERR_RESP_EN parameter is set to 1, reading the UART_RX
> >> >> while the FIFO is enabled and UART_LSR_DR is not set will generate a
> >> >> PSLVERR error, which may lead to a system panic. There are two methods
> >> >> to prevent PSLVERR: one is to check if UART_LSR_DR is set before reading
> >> >> UART_RX when the FIFO is enabled, and the other is to read UART_RX when
> >> >> the FIFO is disabled.
> >> >>
> >> >> Given these two scenarios, the FIFO must be disabled before the
> >> >> "dummy read" operation and re-enabled afterward to maintain normal
> >> >> UART functionality.
> >> >>
> >> >> Fixes: 424d79183af0 ("serial: 8250_dw: Avoid "too much work" from bogus rx timeout interrupt")
> >> >> Signed-off-by: Yunhui Cui <cuiyunhui@...edance.com>
> >> >> Cc: stable@...r.kernel.org
> >> >> ---
> >> >> drivers/tty/serial/8250/8250_dw.c | 10 +++++++++-
> >> >> 1 file changed, 9 insertions(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
> >> >> index 1902f29444a1c..082b7fcf251db 100644
> >> >> --- a/drivers/tty/serial/8250/8250_dw.c
> >> >> +++ b/drivers/tty/serial/8250/8250_dw.c
> >> >> @@ -297,9 +297,17 @@ static int dw8250_handle_irq(struct uart_port *p)
> >> >> uart_port_lock_irqsave(p, &flags);
> >> >> status = serial_lsr_in(up);
> >> >>
> >> >> - if (!(status & (UART_LSR_DR | UART_LSR_BI)))
> >> >> + if (!(status & (UART_LSR_DR | UART_LSR_BI))) {
> >> >> + /* To avoid PSLVERR, disable the FIFO first. */
> >> >> + if (up->fcr & UART_FCR_ENABLE_FIFO)
> >> >> + serial_out(up, UART_FCR, 0);
> >> >> +
> >> >> serial_port_in(p, UART_RX);
> >> >>
> >> >> + if (up->fcr & UART_FCR_ENABLE_FIFO)
> >> >> + serial_out(up, UART_FCR, up->fcr);
> >> >> + }
> >> >> +
> >> >> uart_port_unlock_irqrestore(p, flags);
> >> >> }
> >>
> >> I do not know enough about the hardware. Is a dummy read really the only
> >> way to exit the RX_TIMEOUT state?
> >>
> >> What if there are bytes in the TX-FIFO. Are they in danger of being
> >> cleared?
> >>
> >> From [0] I see:
> >>
> >> "Writing a "0" to bit 0 will disable the FIFOs, in essence turning the
> >> UART into 8250 compatibility mode. In effect this also renders the rest
> >> of the settings in this register to become useless. If you write a "0"
> >> here it will also stop the FIFOs from sending or receiving data, so any
> >> data that is sent through the serial data port may be scrambled after
> >> this setting has been changed. It would be recommended to disable FIFOs
> >> only if you are trying to reset the serial communication protocol and
> >> clearing any working buffers you may have in your application
> >> software. Some documentation suggests that setting this bit to "0" also
> >> clears the FIFO buffers, but I would recommend explicit buffer clearing
> >> instead using bits 1 and 2."
> >>
> >> Have you performed tests where you fill the TX-FIFO and then
> >> disable/enable the FIFO to see if the TX-bytes survive?
> >
> > Sorry, I haven't conducted relevant tests. The reason I made this
> > modification is that it clearly contradicts the logic of avoiding
> > PSLVERR. Disabling the FIFO can at least prevent the Panic() caused by
> > PSVERR.
>
> I am just wondering if there is some other way to avoid this. But since
> we are talking about a hardware quirk and it is only related to
> suspend/resume, maybe it is acceptable to risk data corruption in this
> case. (?)
>
> I am hoping Douglas can chime in.
>
> John Ogness
>
> >> [0] https://en.wikibooks.org/wiki/Serial_Programming/8250_UART_Programming
I'm not sure I have too much to add here. :( I did the investigation
and wrote the original patch over 8 years ago and I no longer have
access to the hardware where the problem first reproduced. I vaguely
remember the problem, but only because I re-read the commit message I
wrote 8 years ago. :-P
I will say that for the hardware I was working with, it wouldn't have
been the end of the world if there was a tiny bit of UART corruption
around suspend / resume. Of course, nothing about the workaround
specifically checks for suspend/resume, that was just how we were
reproducing problems. If there is ever any other way to get a "RX
timeout with no data" then we'd also potentially cause corruption with
the new patch. Still better than an interrupt storm or a panic,
though...
Not to say that I'm NAKing anything (since I'm a bit of a bystander in
this case), but I wonder if there's anything you could do better?
Ideas, maybe?
1. Could the PSLVERR be ignored in this case?
2. Could we temporarily disable generation of the PSLVERR for this read?
3. Could we detect when PSLVERR_RESP_EN=1 and only do the FIFO
disable/enable dance in that case?
-Doug
Powered by blists - more mailing lists