Message-ID: <OS0PR01MB5922847D826B17D19CB5DEA8862E9@OS0PR01MB5922.jpnprd01.prod.outlook.com>
Date: Mon, 24 Oct 2022 16:42:26 +0000
From: Biju Das <biju.das.jz@...renesas.com>
To: Marc Kleine-Budde <mkl@...gutronix.de>
CC: Wolfgang Grandegger <wg@...ndegger.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Vincent Mailhol <mailhol.vincent@...adoo.fr>,
Stefan Mätje <stefan.maetje@....eu>,
Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@...renesas.com>,
Ulrich Hecht <uli+renesas@...nd.eu>,
Christophe JAILLET <christophe.jaillet@...adoo.fr>,
Rob Herring <robh@...nel.org>,
"linux-can@...r.kernel.org" <linux-can@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Geert Uytterhoeven <geert+renesas@...der.be>,
Chris Paterson <Chris.Paterson2@...esas.com>,
"linux-renesas-soc@...r.kernel.org"
<linux-renesas-soc@...r.kernel.org>
Subject: RE: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo receive
Hi Marc,
> Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo
> receive
>
> On 22.10.2022 09:15:01, Biju Das wrote:
> > We are seeing IRQ storm on global receive IRQ line under heavy CAN bus
> > load conditions with both CAN channels are enabled.
> >
> > Conditions:
> > The global receive IRQ line is shared between can0 and can1, either
> > of the channels can trigger interrupt while the other channel irq
> > line is disabled(rfie).
> > When global receive IRQ interrupt occurs, we mask the interrupt in
> > irqhandler. Clearing and unmasking of the interrupt is happening in
> > rx_poll(). There is a race condition where rx_poll unmask the
> > interrupt, but the next irq handler does not mask the irq due to
> > NAPIF_STATE_MISSED flag.
>
> Why does this happen?
It is due to a race between rx_poll() and an interrupt triggered by the
other channel.
> Is it a problem that you call
> rcar_canfd_handle_global_receive() for a channel that has the IRQs
> actually disabled in hardware?
Yes, because the other channel triggers an interrupt and the same call
is executed for channel0 again.
Scenario:
The channel0 IRQ line is disabled in the IRQ handler because the ch0
RX FIFO status is set, and the handler schedules the NAPI poll. Before
rx_poll() runs, you get another interrupt from channel1. Since the ch0
RX FIFO status is still set, the handler calls napi_schedule_prep()
again and the NAPI state becomes missed (NAPIF_STATE_MISSED).
Now assume rx_poll() runs: it clears the interrupt and unmasks the IRQ
line. This time we get an IRQ from channel0. Since the state is missed,
the handler does not mask the line, so it stays unmasked.
In the end you have an interrupt that is never cleared, leading to an
IRQ storm.
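To make the sequence above easier to follow, here is a rough userspace
sketch of it. It is a toy model of the narrative in this mail only, not
the driver code: every toy_* name is hypothetical, the way a missed
state keeps the poll scheduled is an assumption, and the skip_masked
guard is just one possible way to ignore a channel whose line is
already masked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state for one channel; every name here is hypothetical. */
struct toy_chan {
	bool rfif;   /* RX FIFO status flag: data pending */
	bool rfie;   /* RX FIFO interrupt enable: true = unmasked */
	bool sched;  /* stands in for the NAPI "scheduled" state */
	bool missed; /* stands in for NAPIF_STATE_MISSED */
};

/* Toy analogue of napi_schedule_prep(): fails and records a miss
 * if a poll is already scheduled. */
static bool toy_schedule_prep(struct toy_chan *ch)
{
	if (ch->sched) {
		ch->missed = true;
		return false;
	}
	ch->sched = true;
	return true;
}

/* Shared global receive handler, run for a channel whenever either
 * channel raises the interrupt. */
static void toy_global_rx_irq(struct toy_chan *ch, bool skip_masked)
{
	if (!ch->rfif)
		return;
	if (skip_masked && !ch->rfie)
		return;			/* hypothetical guard: line already masked */
	if (toy_schedule_prep(ch))
		ch->rfie = false;	/* mask until the poll runs */
}

/* Toy rx_poll(): clear and unmask, as described in the mail. */
static void toy_rx_poll(struct toy_chan *ch)
{
	assert(ch->sched);		/* a poll only runs when scheduled */
	ch->rfif = false;
	ch->rfie = true;
	if (ch->missed)
		ch->missed = false;	/* assumption: a miss keeps the poll scheduled */
	else
		ch->sched = false;
}

/* Returns how often the handler runs for channel0 in the last step,
 * capped at limit so the "storm" terminates in the model. */
static int toy_simulate(bool skip_masked, int limit)
{
	struct toy_chan ch0 = { .rfif = true, .rfie = true };
	int calls = 0;

	toy_global_rx_irq(&ch0, skip_masked);	/* channel0 IRQ: mask + schedule */
	toy_global_rx_irq(&ch0, skip_masked);	/* channel1 IRQ before the poll */
	toy_rx_poll(&ch0);			/* poll clears and unmasks */
	ch0.rfif = true;			/* new frame on channel0 */
	while (ch0.rfif && ch0.rfie && calls < limit) {
		toy_global_rx_irq(&ch0, skip_masked);
		calls++;
	}
	return calls;
}
```

In this model toy_simulate(false, 5) runs the handler until the cap of
5 is hit, because the line is never masked again, while
toy_simulate(true, 5) returns 1, since the first handler run masks the
line.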
Cheers,
Biju