Message-ID: <OS0PR01MB59224B2AE8F84B961D2A061C862E9@OS0PR01MB5922.jpnprd01.prod.outlook.com>
Date: Mon, 24 Oct 2022 18:31:27 +0000
From: Biju Das <biju.das.jz@...renesas.com>
To: Marc Kleine-Budde <mkl@...gutronix.de>
CC: Wolfgang Grandegger <wg@...ndegger.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Vincent Mailhol <mailhol.vincent@...adoo.fr>,
Stefan Mätje <stefan.maetje@....eu>,
Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@...renesas.com>,
Ulrich Hecht <uli+renesas@...nd.eu>,
Christophe JAILLET <christophe.jaillet@...adoo.fr>,
Rob Herring <robh@...nel.org>,
"linux-can@...r.kernel.org" <linux-can@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Geert Uytterhoeven <geert+renesas@...der.be>,
Chris Paterson <Chris.Paterson2@...esas.com>,
"linux-renesas-soc@...r.kernel.org" <linux-renesas-soc@...r.kernel.org>
Subject: RE: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo receive
Hi Marc,
> Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo
> receive
>
> On 24.10.2022 16:55:56, Biju Das wrote:
> > Hi Marc,
> > > Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global
> > > fifo receive
> > >
> > > On 24.10.2022 17:37:35, Marc Kleine-Budde wrote:
> > > > On 22.10.2022 09:15:01, Biju Das wrote:
> > > > > > We are seeing an IRQ storm on the global receive IRQ line under
> > > > > > heavy CAN bus load conditions when both CAN channels are enabled.
> > > > > >
> > > > > > Conditions:
> > > > > > The global receive IRQ line is shared between can0 and can1;
> > > > > > either of the channels can trigger an interrupt while the other
> > > > > > channel's IRQ line is disabled (rfie).
> > > > > > When a global receive IRQ interrupt occurs, we mask the interrupt
> > > > > > in the irqhandler. Clearing and unmasking of the interrupt happens
> > > > > > in rx_poll(). There is a race condition where rx_poll() unmasks
> > > > > > the interrupt, but the next irq handler does not mask the irq due
> > > > > > to the NAPIF_STATE_MISSED flag.
> > > >
> > > > > Why does this happen? Is it a problem that you call
> > > > > rcar_canfd_handle_global_receive() for a channel that has the IRQs
> > > > > actually disabled in hardware?
> > >
> > > > Can you check if the IRQ is active _and_ enabled before handling the
> > > > IRQ on a particular channel?
> >
> > You mean IRQ handler or rx_poll()??
>
> I mean the IRQ handler.
>
> Consider the IRQ for channel0 is disabled but active and the IRQ for
> channel1 is enabled and active. The
> rcar_canfd_global_receive_fifo_interrupt() will iterate over both
> channels, and rcar_canfd_handle_global_receive() will serve the
> channel0 IRQ, even if the IRQ is _not_ enabled. So I suggested to only
> handle a channel's RX IRQ if that IRQ is actually enabled.
>
> Assuming "cc & RCANFD_RFCC_RFIE" checks if IRQ is enabled:
>
> index 567620d215f8..ea828c1bd3a1 100644
> --- a/drivers/net/can/rcar/rcar_canfd.c
> +++ b/drivers/net/can/rcar/rcar_canfd.c
> @@ -1157,11 +1157,13 @@ static void rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
>  {
>         struct rcar_canfd_channel *priv = gpriv->ch[ch];
>         u32 ridx = ch + RCANFD_RFFIFO_IDX;
> -       u32 sts;
> +       u32 sts, cc;
>
>         /* Handle Rx interrupts */
>         sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(gpriv, ridx));
> -       if (likely(sts & RCANFD_RFSTS_RFIF)) {
> +       cc = rcar_canfd_read(priv->base, RCANFD_RFCC(gpriv, ridx));
> +       if (likely(sts & RCANFD_RFSTS_RFIF &&
> +                  cc & RCANFD_RFCC_RFIE)) {
>                 if (napi_schedule_prep(&priv->napi)) {
>                         /* Disable Rx FIFO interrupts */
>                         rcar_canfd_clear_bit(priv->base,
>
> Please check if that fixes your issue.
It looks like your solution will also work.
I will check tomorrow and provide feedback.
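
For reference, the case described above (channel0 flag set but IRQ disabled, channel1 flag set and IRQ enabled) can be modelled in plain userspace C. This is only a sketch and not driver code: the bit positions and every name in it (model_fifo, MODEL_RFIF, MODEL_RFIE, should_serve_flag_only, should_serve_flag_and_enable) are assumptions made up for illustration. It shows that testing the status flag alone would serve the masked channel, while testing flag and enable bit together skips it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for this model only. */
#define MODEL_RFIF (1u << 3) /* status: Rx FIFO interrupt flag (pending) */
#define MODEL_RFIE (1u << 1) /* config: Rx FIFO interrupt enable */

/* One Rx FIFO as seen by the model: a status and a config register. */
struct model_fifo {
	uint32_t sts;
	uint32_t cc;
};

/* Behaviour before the change: serve whenever the flag is set. */
static bool should_serve_flag_only(const struct model_fifo *f)
{
	return (f->sts & MODEL_RFIF) != 0;
}

/* Suggested behaviour: serve only if the flag is set and the IRQ is enabled. */
static bool should_serve_flag_and_enable(const struct model_fifo *f)
{
	return (f->sts & MODEL_RFIF) && (f->cc & MODEL_RFIE);
}
```

With sts = MODEL_RFIF and cc = 0 (flag set while masked), the first helper returns true and the second returns false.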
>
> > The IRQ handler checks the status and disables (masks) the IRQ line.
> > rx_poll() clears the status and enables (unmasks) the IRQ line.
> >
> > The status flag is set by HW regardless of whether the line is in
> > the disabled or enabled state.
> >
> > Channel0 and channel1 have 2 IRQ lines within the IP, which are ORed
> > together to provide the global receive interrupt (shared line).
>
> > > A clearer approach would be to get rid of the global interrupt handlers
> > > altogether. If the hardware only gives 1 IRQ line for more than 1 channel,
> > > the driver would register an IRQ handler for each channel
> > > (with the shared attribute). The IRQ handler must check if the IRQ is
>                      ^^^^^^^^^
> That should be "flag".
OK.
>
> > > pending and enabled. If not return IRQ_NONE, otherwise handle and
> > > return IRQ_HANDLED.
> >
> > That involves restructuring the IRQ handler altogether.
>
> ACK
>
> > RZ/G2L has shared line for rx fifos {ch0 and ch1} -> 2 IRQ routine
> > with shared attributes.
>
> It's the same IRQ handler (or IRQ routine), but called 1x for each
> channel, so 2x in total. SHARED is actually an IRQ flag passed as the
> 4th argument of the devm_request_irq() function.
>
> | devm_request_irq(..., ..., ..., IRQF_SHARED, ..., ...);
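
As a rough illustration of the per-channel shared-handler idea, here is a userspace model in C. It is a sketch under assumptions and not kernel code: model_irqreturn stands in for the kernel's irqreturn_t, and model_channel, model_channel_irq, model_shared_line_fire and the bit values are hypothetical names invented for this example. Each channel's handler returns "none" unless its flag is both pending and enabled, and the loop imitates a shared line calling every registered handler.

```c
#include <stdint.h>

/* Stand-ins for IRQ_NONE / IRQ_HANDLED; model only. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical bit positions, chosen for this model only. */
#define MODEL_RFIF (1u << 3) /* status: Rx FIFO interrupt flag (pending) */
#define MODEL_RFIE (1u << 1) /* config: Rx FIFO interrupt enable */

struct model_channel {
	uint32_t sts;        /* modelled status register */
	uint32_t cc;         /* modelled config register */
	unsigned int served; /* how often this channel was handled */
};

/* Per-channel handler: handle only if pending _and_ enabled. */
static enum model_irqreturn model_channel_irq(struct model_channel *ch)
{
	if (!(ch->sts & MODEL_RFIF) || !(ch->cc & MODEL_RFIE))
		return MODEL_IRQ_NONE;

	ch->cc &= ~MODEL_RFIE; /* mask until the poll routine unmasks again */
	ch->served++;
	return MODEL_IRQ_HANDLED;
}

/* Shared line, reduced: call each channel's handler, count the handled ones. */
static unsigned int model_shared_line_fire(struct model_channel *chs,
					   unsigned int n)
{
	unsigned int handled = 0;

	for (unsigned int i = 0; i < n; i++)
		if (model_channel_irq(&chs[i]) == MODEL_IRQ_HANDLED)
			handled++;
	return handled;
}
```

In this model a channel whose flag is set while its IRQ is masked is never served, so the storm condition from the patch description cannot be triggered through that channel.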
>
> > R-Car SoCs have a shared line for rx fifos {ch0 and ch1} and error
> > interrupts -> 3 IRQ routines with shared attributes.
>
> > R-CarV3U SoCs have a shared line for rx fifos {ch0 to ch8} and error
> > interrupts -> 9 IRQ routines with shared attributes.
>
> I think you got the point, I just wanted to point out the usual way
> they are called.
>
> > Yes, I can send follow up patches for migrating to shared interrupt
> > handlers as enhancement. Please let me know.
>
> Please check if my patch snippet from above works. To fix the IRQ
> storm problem I'd like to have a simple and short solution that can go
> into stable before restructuring the IRQ handlers.
OK, I will provide feedback tomorrow.
Cheers,
Biju