Message-ID: <20200630105512.GA530@jagdpanzerIV.localdomain>
Date: Tue, 30 Jun 2020 19:55:12 +0900
From: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
To: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Andy Shevchenko <andy.shevchenko@...il.com>,
Raul Rangel <rrangel@...gle.com>,
Tony Lindgren <tony@...mide.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
kurt@...utronix.de, "S, Shirish" <Shirish.S@....com>,
Peter Zijlstra <peterz@...radead.org>,
John Ogness <john.ogness@...utronix.de>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: UART/TTY console deadlock
On (20/06/30 12:21), Petr Mladek wrote:
> > So... Do we need to hold uart->port when we disable port->irq? What do we
> > race with? Module removal? The function bumps device PM counter (albeit
> > for UART_CAP_RPM ports only).
>
> Honestly, I do not see where a PM counter gets incremented.
  serial8250_do_startup()
    serial8250_rpm_get()
      pm_runtime_get_sync(p->port.dev)
But this does not happen for all ports, just for UART_CAP_RPM ones.
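In other words, the PM reference is conditional on the port capability. A toy
user-space model of that behaviour is below. The struct, the TOY_CAP_RPM value
and the usage counter are made up for illustration and are not the kernel's
definitions:

```c
#define TOY_CAP_RPM (1u << 0)	/* hypothetical bit, stands in for UART_CAP_RPM */

struct toy_port {
	unsigned int capabilities;
	int pm_usage;		/* stands in for the device's runtime-PM usage count */
};

/* Models serial8250_rpm_get(): only RPM-capable ports take a PM reference. */
static void toy_rpm_get(struct toy_port *p)
{
	if (!(p->capabilities & TOY_CAP_RPM))
		return;
	p->pm_usage++;		/* models pm_runtime_get_sync(p->port.dev) */
}

/* Models the start of serial8250_do_startup(); returns the resulting count. */
int toy_startup_pm_usage(unsigned int capabilities)
{
	struct toy_port p = { .capabilities = capabilities, .pm_usage = 0 };

	toy_rpm_get(&p);
	return p.pm_usage;
}
```

A port created with TOY_CAP_RPM ends up with a usage count of 1, any other
port stays at 0.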
> Anyway, __disable_irq_nosync() does nothing when
> irq_get_desc_buslock() returns NULL. And irq_get_desc_buslock()
> takes desc->lock when desc exists. This should be enough to
> synchronize any calls.
>
> > But, at the same time, we do a whole bunch
> > of unprotected port->FOO accesses in serial8250_do_startup(). We even set
> > the IRQF_SHARED up->port.irqflags without grabbing the port->lock:
> >
> >	up->port.irqflags |= IRQF_SHARED;
> >	spin_lock_irqsave(&port->lock, flags);
> >	if (up->port.irqflags & IRQF_SHARED)
> >		disable_irq_nosync(port->irq);
>
> Yup, this looks suspicious. We set a flag in port.irqflags and take the lock
> only when the flag was set. Either everything needs to be done under
> the lock or the lock is not needed.
>
> Well, I might have missed something. I do not fully understand meaning
> and relation of all the structures.
>
> Anyway, I believe that this is a false positive. If I get it correctly
> serial8250_do_startup() must be called before the serial port could
> be registered as a console. It means that it could not be called
> from inside printk().
From my understanding, I'm afraid we are talking about an actual deadlock
here, not a false positive report. Quoting the original email:
: We are trying an S3 suspend stress test and occasionally while
: entering S3 we get a console deadlock.
[..]
> > drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> > index d64ca77d9cfa..ad30991e1b3b 100644
> > --- a/drivers/tty/serial/8250/8250_port.c
> > +++ b/drivers/tty/serial/8250/8250_port.c
> > @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
> >
> >  	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> >  		unsigned char iir1;
> > +		bool irq_shared = up->port.irqflags & IRQF_SHARED;
> > +
> > +		if (irq_shared)
> > +			disable_irq_nosync(port->irq);
> > +
> >  		/*
> >  		 * Test for UARTs that do not reassert THRE when the
> >  		 * transmitter is idle and the interrupt has already
> > @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> >  		 * allow register changes to become visible.
> >  		 */
> >  		spin_lock_irqsave(&port->lock, flags);
> > -		if (up->port.irqflags & IRQF_SHARED)
> > -			disable_irq_nosync(port->irq);
> >
> >  		wait_for_xmitr(up, UART_LSR_THRE);
> >  		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> > @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> >  		iir = serial_port_in(port, UART_IIR);
> >  		serial_port_out(port, UART_IER, 0);
> >
> > -		if (port->irqflags & IRQF_SHARED)
> > -			enable_irq(port->irq);
> >  		spin_unlock_irqrestore(&port->lock, flags);
> > +		if (irq_shared)
> > +			enable_irq(port->irq);
> >
> >  		/*
> >  		 * If the interrupt is not reasserted, or we otherwise
>
> I think that it might be safe but I am not 100% sure, sigh.
Yeah, I'm not 100% sure either, but I'd give it a try.
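
To spell out what the reordering in the quoted patch changes, here is a toy
user-space sketch. It is not kernel code: the toy_* names are invented, the
locks are plain flags, and the only thing it measures is how often the
stand-in for desc->lock is taken while the stand-in for port->lock is held.
It assumes, as discussed above, that both disable_irq_nosync() and
enable_irq() take desc->lock internally:

```c
#include <stdbool.h>

/* Toy stand-ins for port->lock and the IRQ descriptor lock (desc->lock). */
struct toy_lock { bool held; };

static struct toy_lock toy_port_lock, toy_desc_lock;
static int toy_nested;	/* times desc lock was taken under port lock */

static void toy_acquire(struct toy_lock *l)
{
	if (l == &toy_desc_lock && toy_port_lock.held)
		toy_nested++;
	l->held = true;
}

static void toy_release(struct toy_lock *l)
{
	l->held = false;
}

/* Models disable_irq_nosync()/enable_irq(): each takes desc->lock briefly. */
static void toy_irq_toggle(void)
{
	toy_acquire(&toy_desc_lock);
	toy_release(&toy_desc_lock);
}

/* Ordering before the patch: IRQ toggled inside the port->lock section. */
int toy_startup_old(void)
{
	toy_nested = 0;
	toy_acquire(&toy_port_lock);
	toy_irq_toggle();		/* disable_irq_nosync() */
	/* ... THRE test would run here ... */
	toy_irq_toggle();		/* enable_irq() */
	toy_release(&toy_port_lock);
	return toy_nested;
}

/* Ordering after the patch: IRQ toggled outside the port->lock section. */
int toy_startup_new(void)
{
	toy_nested = 0;
	toy_irq_toggle();		/* disable_irq_nosync() */
	toy_acquire(&toy_port_lock);
	/* ... THRE test would run here ... */
	toy_release(&toy_port_lock);
	toy_irq_toggle();		/* enable_irq() */
	return toy_nested;
}
```

With the old ordering the sketch records a port->lock -> desc->lock nesting
for both the disable and the enable call. With the new ordering it records
none, which is the dependency the patch is trying to remove. It says nothing
about whether the unprotected port->irqflags read is safe.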
-ss