Date: Wed, 29 Mar 2017 11:45:41 +0200
From: Olliver Schinagl <o.schinagl@...imaker.com>
To: Andy Shevchenko <andy.shevchenko@...il.com>,
    Douglas Anderson <dianders@...omium.org>,
    Cal Sullivan <california.l.sullivan@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
    linux-rockchip@...ts.infradead.org,
    "linux-serial@...r.kernel.org" <linux-serial@...r.kernel.org>,
    guennadi.liakhovetski@...el.com, jslaby@...e.com,
    Jeffy Chen <jeffy.chen@...k-chips.com>, eric.gao@...k-chips.com,
    briannorris@...omium.org, dev@...ux-sunxi.org,
    linux-rockchip@...ts.infradead.org, wangkefeng.wang@...wei.com,
    noamc@...hip.com, heikki.krogerus@...ux.intel.com,
    jason.uy@...adcom.com, ed.blake@...tec.com,
    Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
    andriy.shevchenko@...ux.intel.com, guennadi.liakhovetski@...el.com
Subject: Re: [PATCH v2] serial: 8250_dw: Avoid "too much work" from bogus rx timeout interrupt

Hey Andy,

On 29-03-17 11:11, Andy Shevchenko wrote:
> On Wed, Mar 29, 2017 at 10:58 AM, Olliver Schinagl <oliver@...inagl.nl> wrote:
>> On 07-02-17 00:30, Douglas Anderson wrote:
>
> First of all I didn't get why people from Cc list are suddenly
> disappeared. Check your mail client settings.
> Returning back some of them.

Apologies, I replied via gmane's news feed to Douglas's initial post, as
I did not have the original post, and I failed to check the other
recipients. My fault. Sorry. I've added the original others as well.

>
>>> It appears that somehow we have a RX Timeout interrupt but there is no
>>> actual data present to receive. When we're in this state the UART
>>> driver claims that it handled the interrupt but it actually doesn't
>>> really do anything. This means that we keep getting the interrupt
>>> over and over again.
>
>> I may be running into the same thing on an A20 SoC, but still in the stage
>> of figuring out what is going on, as we get this error very occasionally. Do
>> you have a way to externally induce this behavior other then suspend/resume?
>> As we get it during uart-use and do not have (or I have never tried)
>> suspend/resume on our platform.
>
> On Intel platforms with this IP I can see similar when run loopback
> test on high speeds.
> California may correct me since he did a lot of investigation of the
> issue on x86.
>
>>> static int dw8250_handle_irq(struct uart_port *p)
>>> {
>>> +       struct uart_8250_port *up = up_to_u8250p(p);
>>>         struct dw8250_data *d = p->private_data;
>>>         unsigned int iir = p->serial_in(p, UART_IIR);
>>> +       unsigned int status;
>>> +       unsigned long flags;
>>> +
>>> +       /*
>>> +        * There are ways to get Designware-based UARTs into a state where
>>> +        * they are asserting UART_IIR_RX_TIMEOUT but there is no actual
>>> +        * data available. If we see such a case then we'll do a bogus
>>> +        * read. If we don't do this then the "RX TIMEOUT" interrupt will
>>> +        * fire forever.
>>
>> I think what you are saying is 'do a bogus read as that is the only way to
>> clear the interrupt, otherwise it will keep firing forever.'?
>
> No, we don't know if this _the only way_. It looks like no one from us
> can tell you a root cause, except may be Synopsys guys.

Has anybody tried to contact Synopsys/DW about this issue at all? True,
it is not the only way (maybe only as far as we know for now), but it is
'the' way currently.

>
>>> +       spin_lock_irqsave(&p->lock, flags);
>>
>> this is a bit above my knowledge of driver etc, but I don't any spinlocks in
>> the 8250 handle_irq glue drivers, except in the OMAP's case where they are
>> handeling a DMA IRQ. So I ask, because I don't know, why is it needed here?
>
> They serialize IO accessors.
>
> Regarding to the rest comments, the patch is already in upstream, if
> you feel that something should be changed, send an incremental fix.

Ah, I thought I had checked, but I didn't see it. I probably forgot to
fetch. I'll send a patch for the small mask fix.

>
>> Once I found a way to reproduce the problem (without suspend) I will test
>> this to see if it fixes it for us too.
>
> It would be appreciated, but better to get know the root cause and
> what _hardware_ guys think about solutions.
>

I read over the docs of the IP block (dw_apb_uart of 2006; I know a
little FPGA programming) but have found nothing yet that would warn about
this behavior. I suppose the hardware/FPGA guys can potentially give more
background here, but it may also simply be an IP bug?

Olliver
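
For readers following the thread, the workaround under discussion (a "bogus read" when the RX timeout interrupt is asserted but no data is pending) can be sketched in user space as below. This is only an illustration, not the driver code: the message quotes just the start of the patch, so the check itself is reconstructed under assumptions. `struct mock_uart`, `mock_serial_in()` and `clear_bogus_rx_timeout()` are hypothetical names, the register offsets and bit values are assumed to match Linux's `serial_reg.h`, the `0x3f` mask on the IIR is an assumption, and the way the mock clears the interrupt on a read of the receive register is a guess at the hardware behaviour described in the thread. The spinlock from the patch is left out because the sketch is single-threaded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register offsets and bits; values assumed to match Linux's serial_reg.h. */
#define UART_RX             0     /* receive buffer */
#define UART_IIR            2     /* interrupt identification */
#define UART_LSR            5     /* line status */
#define UART_IIR_NO_INT     0x01  /* no interrupt pending */
#define UART_IIR_RX_TIMEOUT 0x0c  /* RX timeout interrupt */
#define UART_LSR_DR         0x01  /* receive data ready */
#define UART_LSR_BI         0x10  /* break interrupt */

/* Hypothetical stand-in for the hardware registers. */
struct mock_uart {
    uint8_t iir;
    uint8_t lsr;
    uint8_t rx;
    unsigned int rx_reads;  /* counts reads of the receive buffer */
};

/* Hypothetical register accessor, playing the role of p->serial_in(). */
static unsigned int mock_serial_in(struct mock_uart *u, int offset)
{
    switch (offset) {
    case UART_IIR:
        return u->iir;
    case UART_LSR:
        return u->lsr;
    case UART_RX:
        u->rx_reads++;
        /* Assumed model: reading the receive buffer drops the timeout
         * condition and the data-ready flag. */
        if ((u->iir & 0x3f) == UART_IIR_RX_TIMEOUT)
            u->iir = UART_IIR_NO_INT;
        u->lsr = (uint8_t)(u->lsr & ~UART_LSR_DR);
        return u->rx;
    default:
        return 0;
    }
}

/*
 * Returns true if a bogus read was issued, i.e. the RX timeout interrupt
 * was asserted while the line status reported neither data nor a break.
 */
static bool clear_bogus_rx_timeout(struct mock_uart *u)
{
    unsigned int iir = mock_serial_in(u, UART_IIR);
    unsigned int status;

    if ((iir & 0x3f) != UART_IIR_RX_TIMEOUT)
        return false;

    status = mock_serial_in(u, UART_LSR);
    if (status & (UART_LSR_DR | UART_LSR_BI))
        return false;  /* real data or a break: leave it to normal RX handling */

    (void)mock_serial_in(u, UART_RX);  /* the bogus read */
    return true;
}
```

With the mock's `iir` set to `0xcc` (timeout asserted, FIFO bits set) and `lsr` without `UART_LSR_DR`, `clear_bogus_rx_timeout()` performs exactly one read of the receive buffer and the timeout no longer shows in the IIR. With `UART_LSR_DR` set it does nothing, so real data is still consumed by the regular receive path.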