lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date: Tue, 25 Jun 2024 09:24:44 -0700
From: Douglas Anderson <dianders@...omium.org>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jirislaby@...nel.org>
Cc: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
	Neil Armstrong <neil.armstrong@...aro.org>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Tony Lindgren <tony@...mide.com>,
	Uwe Kleine-König <u.kleine-koenig@...gutronix.de>,
	linux-arm-msm@...r.kernel.org,
	Bjorn Andersson <andersson@...nel.org>,
	Stephen Boyd <swboyd@...omium.org>,
	linux-serial@...r.kernel.org,
	Nícolas F . R . A . Prado <nfraprado@...labora.com>,
	John Ogness <john.ogness@...utronix.de>,
	linux-kernel@...r.kernel.org,
	Yicong Yang <yangyicong@...ilicon.com>,
	Johan Hovold <johan+linaro@...nel.org>,
	Konrad Dybcio <konrad.dybcio@...aro.org>,
	Douglas Anderson <dianders@...omium.org>,
	Rob Herring <robh@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Vijaya Krishna Nivarthi <quic_vnivarth@...cinc.com>
Subject: [PATCH] serial: qcom-geni: Avoid hard lockup if bytes are dropped

If you start sending a large chunk of text over the UART (like `cat
/var/log/messages`) and then hit Ctrl-C to stop it then, as of commit
1788cf6a91d9 ("tty: serial: switch from circ_buf to kfifo"), you'll
get a hard lockup. Specifically, the driver ends up looping in
qcom_geni_serial_send_chunk_fifo(). uart_fifo_out() will return 0
bytes were transferred and the loop will never make progress.

Avoid the hard lockup by making sure we never kick off a transfer that
is larger than we can queue up immediately.

The issue stems from the fact that the geni serial driver tried to be
more efficient by kicking off large transfers. It tried to do this
because the design of geni means that whenever we get to the end of a
transfer there is a period of time where the line goes idle while we
wait for an interrupt to start a new transfer.

The geni serial driver kicked off large transfers by peeking into the
Linux software FIFO and kicking off a transfer based on the number of
bytes there. While that worked (mostly), there was an unhandled corner
case when the Linux software FIFO shrank, as happens when you kill a
process that had queued up lots of data to send.

Prior to the recent kfifo change, the geni driver would keep sending
data that had been "removed" from the Linux software FIFO. While
definitely wrong, this didn't feel too terrible. In the above instance
of hitting Ctrl-C while catting a large file you'd see the file keep
spewing out to the console for a few hundred milliseconds after the
process died. As mentioned above, after the kfifo change we get a hard
lockup.

Digging into the geni serial driver shows a whole pile of issues and
those should be fixed. One patch series attempting to fix the issue
has had positive testing/reviews [1] but it's a fairly large change.
While debating / researching the right long term solution, this small
patch at least prevents the hard lockup.

NOTE: this change does have performance impacts. On my sc7180-trogdor
device I measured something like a 2% slowdown. Others has seen
something more like a 20-25% slowdown [2]. However, correctness trumps
performance so landing this makes sense while better solutions are
devised.

[1] https://lore.kernel.org/r/20240610222515.3023730-1-dianders@chromium.org
[2] https://lore.kernel.org/r/ZnraAlR9QeYhd628@hovoldconsulting.com

Fixes: 1788cf6a91d9 ("tty: serial: switch from circ_buf to kfifo")
Signed-off-by: Douglas Anderson <dianders@...omium.org>
---
As discussed with Johan [3], this probably makes sense to land as a
stopgap while we come to agreement on how to solve the larger issues.

NOTE: I'll be away from my work computer for the next 1.5 weeks.
Hopefully this can land in the meantime. If it needs spinning /
reworking I wouldn't object to someone else taking it on.

I've removed all "Tested-by" tags here since the code is pretty
different and it only solves a subset of the issues of the larger
series.

[3] https://lore.kernel.org/r/ZnraAlR9QeYhd628@hovoldconsulting.com

 drivers/tty/serial/qcom_geni_serial.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 2bd25afe0d92..fc202233a3ee 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -904,8 +904,8 @@ static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport,
 		goto out_write_wakeup;
 
 	if (!port->tx_remaining) {
-		qcom_geni_serial_setup_tx(uport, pending);
-		port->tx_remaining = pending;
+		qcom_geni_serial_setup_tx(uport, chunk);
+		port->tx_remaining = chunk;
 
 		irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
 		if (!(irq_en & M_TX_FIFO_WATERMARK_EN))
-- 
2.45.2.741.gdbec12cfda-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ