lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 19 Dec 2016 09:12:40 -0800
From:   Doug Anderson <dianders@...omium.org>
To:     Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Peter Hurley <peter@...leysoftware.com>,
        california.l.sullivan@...el.com
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>, jslaby@...e.com,
        Brian Norris <briannorris@...omium.org>,
        "open list:ARM/Rockchip SoC..." <linux-rockchip@...ts.infradead.org>,
        陈渐飞 <jeffy.chen@...k-chips.com>,
        高美才 <eric.gao@...k-chips.com>,
        phillip.raffeck@....de, anton.wuerfel@....de,
        yegorslists@...glemail.com, matwey@....msu.ru,
        tthayer@...nsource.altera.com, linux-serial@...r.kernel.org,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] serial: 8250: Avoid "too much work" from bogus rx timeout interrupt

Hi,

On Mon, Dec 19, 2016 at 4:59 AM, Andy Shevchenko
<andriy.shevchenko@...ux.intel.com> wrote:
> On Sun, 2016-12-18 at 17:14 -0800, Douglas Anderson wrote:
>> On a Rockchip rk3399-based board during suspend/resume testing, we
>> found that we could get the console UART into a state where it would
>> print this to the console a lot:
>>   serial8250: too much work for irq42
>
> Have you read the following discussion
> https://www.spinics.net/lists/kernel/msg2059543.html

No, I wasn't aware of that discussion.  Yup, basically the exact same
thing is happening here.  Good to know I'm not alone.  Any idea if the
Baytrail UART is also based on DesignWare IP?

In that thread, Peter said:

> I think there is every likelihood of spurious RX timeout interrupts
> tripping this patch, sorry.
>
> Unfortunately, I think UART_BUG_ is the only viable possibility.
> Or perhaps fixing the port type as PORT_8250 (thus disabling the fifos).

My change is slightly different than California's in that I'm actually
throwing away the bogus byte and his patch was treating it as a valid
byte.  I don't know if that makes the patch more or less palatable.

I would hate to lose access to the FIFOs just due to this weird corner case.

Do we really think there's a case where there's an RX Timeout
interrupt w/ no "data ready" but that later the data ready will show
up?  Can you quantify how much later you think it will show up?  If we
can quantify how much longer the data will show up in then we should
probably just do a timeout loop right where I added my patch.

Specifically, here's what's happening today with RX Timeout interrupt
without "data ready":

1. We'll get the interrupt
2. We won't do _anything_ to service the interrupt.
3. We'll return back to serial8250_interrupt(), where we'll keep
looping until we get "too much work"
4. We'll break out, but the interrupt will still be active.
5. Go back to #1

...and since this interrupt will keep firing and firing and firing
with no delay in-between, we'll effectively lock the CPU up.

If there are some UARTs that eventually get themselves out of this
state by asserting "data ready" then the above won't be an "infinite"
loop but it will effectively be a tight loop where we won't let
userspace run and won't service other interrupts until we actually get
the data ready.  Since we're already blocking everything else, it
seems like it might be better to directly loop in
serial8250_handle_irq() with a timeout of some sort (how long?  100
us?  1 ms?).  Then we if we get the timeout then we can do the read
and safely work ourselves free.

What do others think about that?


-Doug

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ