[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z8BboX9RxZBSXRr1@shell.armlinux.org.uk>
Date: Thu, 27 Feb 2025 12:33:37 +0000
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Furong Xu <0x1207@...il.com>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
Alexandre Torgue <alexandre.torgue@...s.st.com>,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Emil Renner Berthing <kernel@...il.dk>,
Eric Dumazet <edumazet@...gle.com>,
Fabio Estevam <festevam@...il.com>, imx@...ts.linux.dev,
Inochi Amaoto <inochiama@...il.com>,
Jakub Kicinski <kuba@...nel.org>,
Jan Petrous <jan.petrous@....nxp.com>,
Jon Hunter <jonathanh@...dia.com>,
linux-arm-kernel@...ts.infradead.org,
linux-stm32@...md-mailman.stormreply.com,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
Minda Chen <minda.chen@...rfivetech.com>, netdev@...r.kernel.org,
NXP S32 Linux Team <s32@....com>, Paolo Abeni <pabeni@...hat.com>,
Pengutronix Kernel Team <kernel@...gutronix.de>,
Sascha Hauer <s.hauer@...gutronix.de>,
Shawn Guo <shawnguo@...nel.org>,
Thierry Reding <treding@...dia.com>, hailong.fan@...ngine.com,
2694439648@...com
Subject: Re: net: stmmac: weirdness in stmmac_hw_setup()
On Wed, Feb 26, 2025 at 10:43:26AM +0800, Furong Xu wrote:
> On Thu, 20 Feb 2025 16:17:43 +0000, "Russell King (Oracle)" wrote:
>
> > While looking through the stmmac driver, I've come across some
> > weirdness in stmmac_hw_setup() which looks completely barmy to me.
> >
> > It seems that it follows the initialisation suggested by Synopsys
> > (as best I can determine from the iMX8MP documentation), even going
> > as far as to *enable* transmit and receive *before* the network
> > device has been administratively brought up. stmmac_hw_setup() does
> > this:
> >
> > /* Enable the MAC Rx/Tx */
> > stmmac_mac_set(priv, priv->ioaddr, true);
> >
> > which sets the TE and RE bits in the MAC configuration register.
> >
> > This means that if the network link is active, packets will start
> > to be received and will be placed into the receive descriptors.
> >
> > We won't transmit anything because we won't be placing packets in
> > the transmit descriptors to be transmitted.
> >
> > However, this in stmmac_hw_setup() is just wrong. Can it be deleted
> > as per the below?
> > Tested-by: Thierry Reding <treding@...dia.com>
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index c2913f003fe6..d6e492f523f5 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -3493,9 +3493,6 @@ static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
> > priv->hw->rx_csum = 0;
> > }
> >
> > - /* Enable the MAC Rx/Tx */
> > - stmmac_mac_set(priv, priv->ioaddr, true);
> > -
> > /* Set the HW DMA mode and the COE */
> > stmmac_dma_operation_mode(priv);
> >
>
> A better fix here:
> https://lore.kernel.org/all/tencent_CCC29C4F562F2DEFE48289DB52F4D91BDE05@qq.com/
>
> I can reproduce and confirm that patch works well.
> I was going to leave a Tested-by: tag on that patch if everything looks good enough,
> but the author (now copied) never come back.
As I now have access to the databook, the initialisation guidance given
in appendix E is very clear that at least the RE bit should not be set
until initialisation of both DMA and MAC have been completed.
RE allows the MAC to start receiving packets and placing them in the
FIFO for the DMA block to then transfer them to memory.
If the link is down, then there should be no activity on the receive
interface, so not setting RE until the link is up should have no
effect. Therefore, start_rx methods that set this bit should not be
doing so, and we should not set this bit until mac_link_up() has
been called.
Analysing the code for the RE bit, we have:
1. xxx_set_mac() (called via stmmac_mac_set()) which sets/clears both
the RE and TE bits together. This is called with a "true" argument
from:
1.1. stmmac_mac_link_up() - correct, the link has come up.
1.2. stmmac_hw_setup() - incorrect, we don't know whether the link
is up or down. I think simply removing this would be correct,
provided the patches that move phylink_resume() later in
stmmac_resume() are merged (will be posted non-RFC only once my
current set of stmmac clock changes have been merged.)
1.3. stmmac_xdp_open() - probably incorrect, we don't know whether
the link is up or down. stmmac_xdp_open() is only called from
stmmac_xdp_set_prog(), and the network interface must be
running. This implies that everything has already been
initialised, and phylink has been started. stmmac_xdp_open()
also messes with the carrier, which will upset phylink.
I think the right solution here is to call phylink_stop() in
stmmac_xdp_release() and phylink_start() in stmmac_xdp_open(),
removing the stmmac_mac_set() and netif_carrier*() calls in
both these functions.
2. xxx_pmt() (called via stmmac_pmt()) sets RE if (e.g. dwxgmac2)
WAKE_MAGIC or WAKE_UCAST is set.
This brings up questions. The only place it's called with a
potentially non-zero argument is in stmmac_suspend(). A few lines
above is:
/* Stop TX/RX DMA */
stmmac_stop_all_dma(priv);
which is a reasonable thing to do when suspending. However, the
DW databook states clearly that RE should not be set if DMA is
not able to empty the FIFO between the MAC and DMA otherwise it
will overflow. So... I think this causes overflow there. I'm not
sure what effect that has. Has wake-on-lan been tested?
3. xxx_start_rx() (called via stmmac_start_rx()) sets RE along with
starting the requested queue.
This is called in two places:
3.1. stmmac_test_flowctrl(). It looks to me like this needs the link
to already be up, which means that RE must already be set. Note
that the call to stmmac_stop_rx() does not clear RE.
3.2. stmmac_start_rx_dma() which has two callers.
3.2.1. stmmac_start_all_dma(). This is called from:
3.2.1.1. the end of stmmac_hw_setup(), which has already called
stmmac_mac_set() which would've set the RE bit. Same
arguments as for 1.2 for why this is wrong.
3.2.2. the end of stmmac_enable_rx_queue(), which is called from
stmmac_xdp_enable_pool(). Very similar to 3.1 above.
TE seems to be a very similar story with the exception of xxx_pmt(),
except the functions are "tx" rather than "rx".
So, my plan with regard to RE is:
1. Add phylink_start() and phylink_stop() into stmmac_xdp_open()
and stmmac_xdp_release() to the link is properly managed.
2. Remove setting RE from xxx_start_rx().
3. Remove stmmac_mac_set(, true) from everywhere except
stmmac_mac_link_up().
4. Remove stmmac_mac_set(, false) from stmmac_release(). This has
already called phylink_stop(), which will have synchronously called
stmmac_mac_link_down().
5. With (1) addressed stmmac_mac_set(, false) and netif_carrier_off()
both become redundant in stmmac_xdp_release().
6. Remove stmmac_stop_all_dma() and stmmac_mac_set(, false) in
stmmac_dvr_remove(). Not only is this inherently racy (the network
device is *still* *published* *to* *userspace* here, but one of
the jobs unregister_netdev() does is to take the network device
down before unregistering - in other words, .ndo_stop() will
have been called and completed before this returns, which will have
the effect of calling stmmac_mac_link_down() (which will call
stmmac_mac_set(, false), and stmmac_release() will have already
called stmmac_stop_all_dma().
(6) is a frequent programming mistake that I see in lots of network
drivers. It seems most people assume that unregister_netdev() does
nothing, and they need to manually take stop the MAC and take down the
link. :(
I'm going to leave xxx_pmt() alone for the moment as although that's
questionable, I think feedback from people using WoL is necessary.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Powered by blists - more mailing lists