Date: Thu, 16 May 2024 21:22:17 -0700
From: Doug Brown <doug@...morgal.com>
To: Jonas Gorski <jonas.gorski@...il.com>, linux-kernel@...r.kernel.org,
 linux-serial@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Jiri Slaby <jirislaby@...nel.org>,
 Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
 Florian Fainelli <florian.fainelli@...adcom.com>, stable@...r.kernel.org
Subject: Re: [PATCH v2] serial: core: only stop transmit when HW fifo is empty

Hello,

On 3/3/2024 7:08 AM, Jonas Gorski wrote:
> If the circular buffer is empty, it just means we fit all characters to
> send into the HW fifo, but not that the hardware finished transmitting
> them.
> 
> So if we immediately call stop_tx() after that, this may abort any
> pending characters in the HW fifo, and cause dropped characters on the
> console.
> 
> Fix this by only stopping tx when the tx HW fifo is actually empty.
> 
> Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")
> Cc: stable@...r.kernel.org
> Signed-off-by: Jonas Gorski <jonas.gorski@...il.com>
> ---
> (this is v2 of the bcm63xx-uart fix attempt)
> 
> v1 -> v2
> * replace workaround with fix for core issue
> * add Cc: for stable
> 
> I'm somewhat confident this is the core issue causing the broken output
> with bcm63xx-uart, and there is no actual need for the UART_TX_NOSTOP.
> 
> I wouldn't be surprised if this also fixes mxs-uart for which
> UART_TX_NOSTOP was introduced.
> 
> If it does, there is no need for the flag anymore.
>   include/linux/serial_core.h | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
> index 55b1f3ba48ac..bb0f2d4ac62f 100644
> --- a/include/linux/serial_core.h
> +++ b/include/linux/serial_core.h
> @@ -786,7 +786,8 @@ enum UART_TX_FLAGS {
>   	if (pending < WAKEUP_CHARS) {					      \
>   		uart_write_wakeup(__port);				      \
>   									      \
> -		if (!((flags) & UART_TX_NOSTOP) && pending == 0)	      \
> +		if (!((flags) & UART_TX_NOSTOP) && pending == 0 &&	      \
> +		    __port->ops->tx_empty(__port))			      \
>   			__port->ops->stop_tx(__port);			      \
>   	}								      \
>   									      \

I just upgraded to kernel 6.9 and discovered through a git bisect that
this patch (7bfb915a597a301abb892f620fe5c283a9fdbd77) causes a problem
with the legacy pxa.c serial driver (CONFIG_SERIAL_PXA_NON8250). I'm
using it with a PXA168-based ARM device for a serial console as well as
getty. With this patch applied, transmissions get hung up before they
finish. The data isn't lost, because the next time a transmit occurs,
the delayed data finally goes out -- but something seems to be causing
it to get stuck right at the end of many, but not all, transmissions.
For example, if I type "ps" and hit enter, nothing shows up until I hit
enter again, which finally kickstarts the whole TX process and then I
get all of the queued ps output.

I'm really confused about this symptom because it seems at face value
like this patch would only ever improve the situation by preventing
stop_tx() from being called too early. There's something about the pxa
driver that is happier when stop_tx() is called with an empty buffer
even if the UART is reporting that it's not empty yet. I tested some
other random systems in qemu and couldn't reproduce this issue, so the
problem may very well be limited just to this driver/hardware...
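
To make the behavior change concrete, here is the stop condition from the
quoted hunk modeled as two standalone predicates. This is only a sketch,
not kernel code: the function names are hypothetical, the flag value is a
stand-in, and the boolean hw_fifo_empty stands in for a nonzero return
from __port->ops->tx_empty(__port).

```c
#include <stdbool.h>

/* Stand-in for the kernel's UART_TX_NOSTOP flag; the value here is illustrative. */
#define UART_TX_NOSTOP (1u << 0)

/*
 * Condition before the patch: stop_tx() is called as soon as the
 * circular buffer has been drained (pending == 0), regardless of
 * what is still sitting in the hardware FIFO.
 */
static bool should_stop_tx_old(unsigned int flags, unsigned int pending)
{
	return !(flags & UART_TX_NOSTOP) && pending == 0;
}

/*
 * Condition after the patch: stop_tx() is additionally gated on the
 * hardware reporting an empty transmitter. hw_fifo_empty models a
 * nonzero result from the driver's tx_empty() callback.
 */
static bool should_stop_tx_new(unsigned int flags, unsigned int pending,
			       bool hw_fifo_empty)
{
	return !(flags & UART_TX_NOSTOP) && pending == 0 && hw_fifo_empty;
}
```

The only case where the two differ is pending == 0 with the FIFO still
non-empty: the old condition calls stop_tx() there and the new one does
not. That is the case that would be hit at the end of a transmission, so
it at least lines up with where the hang appears, but the sketch does not
explain why the pxa driver needs stop_tx() at that point.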

I realize this driver is old and deprecated (I'm likely one of the few
users left of it) so I'm hesitant to call it a regression. Maybe it's
really a bug in this driver that the new patch exposes? I even thought,
"heck, I should probably be using the newer 8250_pxa driver instead",
but that one is even worse -- it drops TX characters like crazy,
regardless of whether this patch is applied. I want to look into that
problem eventually.

I'm hoping there is some kind of simple fix that can be made to the pxa
driver to work around it with this new behavior. Can anyone think of a
reason that this driver would not like this change? It seems
counterintuitive to me -- the patch makes perfect sense.

Thanks,
Doug
