[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHp75VdZxU67P9RPAmw=hnToR2H-8bfHvzeU4UCwKCpV5xEKNg@mail.gmail.com>
Date: Wed, 7 Oct 2020 17:03:58 +0300
From: Andy Shevchenko <andy.shevchenko@...il.com>
To: Marc Zyngier <maz@...nel.org>
Cc: Linus Walleij <linus.walleij@...aro.org>,
"open list:GPIO SUBSYSTEM" <linux-gpio@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Bartosz Golaszewski <bgolaszewski@...libre.com>,
kernel-team@...roid.com
Subject: Re: [PATCH] gpio: pca953x: Survive spurious interrupts
On Wed, Oct 7, 2020 at 4:20 PM Marc Zyngier <maz@...nel.org> wrote:
> On 2020-10-07 14:10, Andy Shevchenko wrote:
> > On Wed, Oct 7, 2020 at 3:09 PM Marc Zyngier <maz@...nel.org> wrote:
> >> On 2020-10-07 13:02, Andy Shevchenko wrote:
> >> > On Wed, Oct 7, 2020 at 12:49 PM Linus Walleij
> >> > <linus.walleij@...aro.org> wrote:
> >> >> On Mon, Oct 5, 2020 at 4:02 PM Marc Zyngier <maz@...nel.org> wrote:
> >> >>
> >> >> > The pca953x driver never checks the result of irq_find_mapping(),
> >> >> > which returns 0 when no mapping is found. When a spurious interrupt
> >> >> > is delivered (which can happen under obscure circumstances), the
> >> >> > kernel explodes as it still tries to handle the error code as
> >> >> > a real interrupt.
> >> >> >
> >> >> > Handle this particular case and warn on spurious interrupts.
> >> >> >
> >> >> > Signed-off-by: Marc Zyngier <maz@...nel.org>
> >> >
> >> > Wait, doesn't actually [1] fix the reported issue?
> >>
> >> Not at all.
> >>
> >> > Marc, can you confirm this?
> >> >
> >> > [1]: e43c26e12dd4 ("gpio: pca953x: Fix uninitialized pending variable")
> >>
> >> Different bug, really. If an interrupt is *really* pending, and no
> >> mapping established yet, feeding the result of irq_find_mapping() to
> >> handle_nested_irq() will lead to a panic.
> >
> > I don't understand. We have plenty of drivers doing exactly the way
> > without checking this returned code.
>
> I'm sure we do. Most driver code is buggy as hell, but I don't see that
> as a reason to cargo-cult the crap. The API is crystal clear that it can
> return 0 for no mapping, and 0 isn't a valid interrupt.
Yes, and the problem here is that we got this response from IRQ core,
which we shouldn't.
> > What circumstances makes the mapping be absent?
>
> Other bugs in the system ([1]), spurious interrupts (which can *always*
> happen).
>
> > Shouldn't we rather change this:
> >
> > girq->handler = handle_simple_irq;
> > to this:
> > girq->handler = handle_bad_irq;
> > ?
>
> I don't understand what you are trying to achieve with that, apart from
> maybe breaking the driver. The right way to handle spurious interrupts
> is by telling the core code that the interrupt wasn't handled, and to
> let
> the spurious interrupt code do its magic.
handle_bad_irq() is exactly for handling spurious IRQs as far as we
believe documentation. So, by default the driver assigns (should
assign) handle_bad_irq() to all IRQs as a default handler. If, by any
chance, we got it, we already have a proper handler in place. The read
handler is assigned whenever the IRQ core is called to register it (by
means of ->irq_set_type() callback). My understanding that GPIO IRQ
drivers are designed (should be designed) in this way. The approach
will make us sure that we don't have spurious interrupts with assigned
handlers.
> [1] https://lore.kernel.org/r/20201005111443.1390096-1-maz@kernel.org
--
With Best Regards,
Andy Shevchenko
Powered by blists - more mailing lists