Message-ID:
<TYWPR01MB11030FA7F8B6B30B8A7759847D83D2@TYWPR01MB11030.jpnprd01.prod.outlook.com>
Date: Tue, 10 Dec 2024 06:46:04 +0000
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh@...esas.com>
To: nikita.yoush <nikita.yoush@...entembedded.com>, Andrew Lunn
<andrew@...n.ch>, "David S. Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Geert Uytterhoeven <geert+renesas@...der.be>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-renesas-soc@...r.kernel.org" <linux-renesas-soc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Michael Dege
<michael.dege@...esas.com>, Christian Mardmoeller
<christian.mardmoeller@...esas.com>, Dennis Ostermann
<dennis.ostermann@...esas.com>, nikita.yoush
<nikita.yoush@...entembedded.com>
Subject: RE: [PATCH net] net: renesas: rswitch: handle stop vs interrupt race
Hello Nikita-san,

Thank you for your patch!

> From: Nikita Yushchenko, Sent: Monday, December 9, 2024 8:32 PM
>
> Currently the stop routine of the rswitch driver does not immediately
> prevent the hardware from continuing to update descriptors and requesting
> interrupts.
>
> It can happen that when rswitch_stop() executes the masking of
> interrupts from the queues of the port being closed, napi poll for
> that port is already scheduled or running on a different CPU. When
> execution of this napi poll completes, it will unmask the interrupts.
> An unmasked interrupt can then fire after rswitch_stop() returns from
> napi_disable() call. Then, the handler won't mask it, because
> napi_schedule_prep() will return false, and interrupt storm will
> happen.
>
> This can't be fixed by making rswitch_stop() call napi_disable() before
> masking interrupts. In this case, the interrupt storm will happen if
> interrupt fires between napi_disable() and masking.
>
> Fix this by checking for priv->opened_ports bit when unmasking
> interrupts after napi poll. For that to be consistent, move
> priv->opened_ports changes into spinlock-protected areas, and reorder
> other operations in rswitch_open() and rswitch_stop() accordingly.

This patch should carry a Fixes tag for net.git. I think the following tag is the
right one because the issue has been present since the first commit of the driver.
This fix cannot be applied directly on top of that commit, but I believe that does
not matter for the Fixes tag [1].

Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"")

> Signed-off-by: Nikita Yushchenko <nikita.yoush@...entembedded.com>

I could not apply this patch on either the net.git main branch or that branch plus
your earlier patch series [2]. However, the fixed code looks good. So,

Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@...esas.com>
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/5.Posting.rst?h=v6.12#n204
[2]
https://patchwork.kernel.org/project/netdevbpf/list/?series=915669
Best regards,
Yoshihiro Shimoda
> ---
> drivers/net/ethernet/renesas/rswitch.c | 33 ++++++++++++++------------
> 1 file changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
> index 6ca5f72193eb..a33f74e1c447 100644
> --- a/drivers/net/ethernet/renesas/rswitch.c
> +++ b/drivers/net/ethernet/renesas/rswitch.c
> @@ -918,8 +918,10 @@ static int rswitch_poll(struct napi_struct *napi, int budget)
>
>  	if (napi_complete_done(napi, budget - quota)) {
>  		spin_lock_irqsave(&priv->lock, flags);
> -		rswitch_enadis_data_irq(priv, rdev->tx_queue->index, true);
> -		rswitch_enadis_data_irq(priv, rdev->rx_queue->index, true);
> +		if (test_bit(rdev->port, priv->opened_ports)) {
> +			rswitch_enadis_data_irq(priv, rdev->tx_queue->index, true);
> +			rswitch_enadis_data_irq(priv, rdev->rx_queue->index, true);
> +		}
>  		spin_unlock_irqrestore(&priv->lock, flags);
>  	}
>
> @@ -1582,20 +1584,20 @@ static int rswitch_open(struct net_device *ndev)
>  	struct rswitch_device *rdev = netdev_priv(ndev);
>  	unsigned long flags;
>
> -	phy_start(ndev->phydev);
> +	if (bitmap_empty(rdev->priv->opened_ports, RSWITCH_NUM_PORTS))
> +		iowrite32(GWCA_TS_IRQ_BIT, rdev->priv->addr + GWTSDIE);
>
>  	napi_enable(&rdev->napi);
> -	netif_start_queue(ndev);
>
>  	spin_lock_irqsave(&rdev->priv->lock, flags);
> +	bitmap_set(rdev->priv->opened_ports, rdev->port, 1);
>  	rswitch_enadis_data_irq(rdev->priv, rdev->tx_queue->index, true);
>  	rswitch_enadis_data_irq(rdev->priv, rdev->rx_queue->index, true);
>  	spin_unlock_irqrestore(&rdev->priv->lock, flags);
>
> -	if (bitmap_empty(rdev->priv->opened_ports, RSWITCH_NUM_PORTS))
> -		iowrite32(GWCA_TS_IRQ_BIT, rdev->priv->addr + GWTSDIE);
> +	phy_start(ndev->phydev);
>
> -	bitmap_set(rdev->priv->opened_ports, rdev->port, 1);
> +	netif_start_queue(ndev);
>
>  	return 0;
>  };
> @@ -1607,7 +1609,16 @@ static int rswitch_stop(struct net_device *ndev)
>  	unsigned long flags;
>
>  	netif_tx_stop_all_queues(ndev);
> +
> +	phy_stop(ndev->phydev);
> +
> +	spin_lock_irqsave(&rdev->priv->lock, flags);
> +	rswitch_enadis_data_irq(rdev->priv, rdev->tx_queue->index, false);
> +	rswitch_enadis_data_irq(rdev->priv, rdev->rx_queue->index, false);
>  	bitmap_clear(rdev->priv->opened_ports, rdev->port, 1);
> +	spin_unlock_irqrestore(&rdev->priv->lock, flags);
> +
> +	napi_disable(&rdev->napi);
>
>  	if (bitmap_empty(rdev->priv->opened_ports, RSWITCH_NUM_PORTS))
>  		iowrite32(GWCA_TS_IRQ_BIT, rdev->priv->addr + GWTSDID);
> @@ -1620,14 +1631,6 @@ static int rswitch_stop(struct net_device *ndev)
>  		kfree(ts_info);
>  	}
>
> -	spin_lock_irqsave(&rdev->priv->lock, flags);
> -	rswitch_enadis_data_irq(rdev->priv, rdev->tx_queue->index, false);
> -	rswitch_enadis_data_irq(rdev->priv, rdev->rx_queue->index, false);
> -	spin_unlock_irqrestore(&rdev->priv->lock, flags);
> -
> -	phy_stop(ndev->phydev);
> -	napi_disable(&rdev->napi);
> -
>  	return 0;
>  };
>
> --
> 2.39.5