lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CA+V-a8uWY1Av8eS1k9C6Td=RuB4PbCnQyXbNLzmhao0nr8Spbg@mail.gmail.com>
Date: Sat, 8 Mar 2025 13:20:15 +0000
From: "Lad, Prabhakar" <prabhakar.csengg@...il.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Rob Herring <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley <conor+dt@...nel.org>, 
	Philipp Zabel <p.zabel@...gutronix.de>, Geert Uytterhoeven <geert+renesas@...der.be>, 
	Giuseppe Cavallaro <peppe.cavallaro@...com>, Jose Abreu <joabreu@...opsys.com>, 
	Alexandre Torgue <alexandre.torgue@...s.st.com>, netdev@...r.kernel.org, 
	devicetree@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-renesas-soc@...r.kernel.org, Biju Das <biju.das.jz@...renesas.com>, 
	Fabrizio Castro <fabrizio.castro.jz@...esas.com>, 
	Lad Prabhakar <prabhakar.mahadev-lad.rj@...renesas.com>
Subject: Re: [PATCH 3/3] net: stmmac: Add DWMAC glue layer for Renesas GBETH

Hi Russell,

On Thu, Mar 6, 2025 at 12:31 AM Russell King (Oracle)
<linux@...linux.org.uk> wrote:
>
> On Wed, Mar 05, 2025 at 09:26:43PM +0000, Lad, Prabhakar wrote:
> > I did investigate on these lines:
> >
> > 1] With my current patch series I have the below in remove callback
> >
> > +static void renesas_gbeth_remove(struct platform_device *pdev)
> > +{
> > +       struct renesas_gbeth *gbeth = get_stmmac_bsp_priv(&pdev->dev);
> > +
> > +       stmmac_dvr_remove(&pdev->dev);
> > +
> > +       clk_bulk_disable_unprepare(gbeth->num_clks, gbeth->clks);
> > +}
> >
> > After dumping the CLK registers I found out that the Rx and Rx-180 CLK
> > never got turned OFF after unbind.
>
> I think that's where further investigation needs to happen. This
> suggests there's more enables than disables for these clocks, but
> there's nothing that I can see in your submitted driver that would
> account for that behaviour.
>
> > 2] I replaced the remove callback with below ie first turn OFF
> > Tx-180/Rx/Rx-180 clocks
> >
> > +static void renesas_gbeth_remove(struct platform_device *pdev)
> > +{
> > +       struct renesas_gbeth *gbeth = get_stmmac_bsp_priv(&pdev->dev);
> > +
> > +       clk_bulk_disable_unprepare(gbeth->num_clks, gbeth->clks);
> > +
> > +      stmmac_dvr_remove(&pdev->dev);
> > +}
> >
> > After dumping the CLK registers I confirmed all the clocks were OFF
> > (CSR/PCLK/Tx/Tx-180/Rx/Rx-180) after unbind operation. Now when I do a
> > bind operation Rx clock fails to enable, which is probably because the
> > PHY is not providing any clock.
>
> However, disabling clocks _before_ unregistering the net device is a
> bad thing to do! The netdev could still be in use.
>
> You can add:
>
>         if (ndev->phydev)
>                 phy_eee_rx_clock_stop(ndev->phydev, false);
>
> just before unregister_netdev() in stmmac_dvr_remove() only as a way
> to test out that idea.
>
I tried the above and I can still see the PHY providing the Rx clocks.

> Do the clock registers you refer to only update when the relevant
> clocks are actually running?
>
> > > However, PHYs that have negotiated EEE are permitted to stop their
> > > receive clock, which can be enabled by an appropriate control bit.
> > > phy_eee_rx_clock_stop() manipulates that bit. stmmac has in most
> > > cases permitted the PHY to stop its receive clock.
> > >
> > You mean phy_eee_rx_clock_stop() is the one which tells the PHY to
> > disable the Rx clocks? Actually I tried the below hunk with this as
> > well the Rx clock fails to be turned ON after unbind/bind operation.
>
> phy_eee_rx_clock_stop() doesn't turn the clock on/off per-se, it
> controls the bit which gives the PHY permission to disable the clock
> when the media is in EEE low-power mode. Note that 802.3 does not
> give a default setting for this bit, so:
>
> > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> > index 0ba434104f5b..e16f4a6f5715 100644
> > --- a/drivers/net/phy/phy.c
> > +++ b/drivers/net/phy/phy.c
> > @@ -1756,6 +1756,7 @@ EXPORT_SYMBOL_GPL(phy_eee_tx_clock_stop_capable);
> >   */
> >  int phy_eee_rx_clock_stop(struct phy_device *phydev, bool clk_stop_enable)
> >  {
> > +       return 0;
> >         int ret;
> >
> >         /* Configure the PHY to stop receiving xMII
>
> May not be wise, and if you want to ensure that the PHY does not stop
> the clock, then forcing clk_stop_enable to zero would be better.
>
> > > NVidia have been a recent victim of this - it is desirable to allow
> > > receive clock stop, but there hasn't been the APIs in the kernel
> > > to allow MAC drivers to re-enable the clock when they need it.
> > >
> > > Up until now, I had thought this was just a suspend/resume issue
> > > (which is NVidia's reported case). Your testing suggests that it is
> > > more widespread than that.
> > >
> > > While I've been waiting to hear from you, I've prepared some patches
> > > that change the solution that I proposed for NVidia (currently on top
> > > of that patch set).
> >
> > I tried your latest patches [0], this didnt resolve the issue.
>
> I assume without the modification to phy_eee_rx_clock_stop() above -
> thanks. If so, then your issue is not EEE related.
>
> > [0] https://lore.kernel.org/all/Z8bbnSG67rqTj0pH@shell.armlinux.org.uk/
>
> Wasn't quite the latest, but still had the same build bug (thanks for
> reporting, now fixed.) Latest is equivalent so no need to re-test.
>
Maybe looking at the behaviour on my SoC, we should request to disable
the PHY clocks upon module removal see below...

> > > However, before I proceed with them, I need you to get to the bottom
> > > of why:
> > >
> > > # ip li set dev $if down
> > > # ip li set dev $if up
> > >
> > > doesn't trigger it, but removing and re-inserting the module does.
> > >
> > Doing the above does not turn OFF/ON all the clocks. Looking at the
> > dump from the CLK driver on my platform only stmmaceth and pclk are
> > the clocks which get toggled and rest remain ON. Note Im not sure if
> > the PHY is disabling the Rx clocks when I run ip li set dev $if down I
> > cannot scope that pin on the board.
> >
> > Please let me know if you have any pointers for me to look further
> > into this issue.
>
> Given the behaviour that you're reporting from your clock layer, I'm
> wondering if the problem is actually there... it seems weird that clocks
> aren't being turned off and on when they should.
>
You were right, the problem was indeed in the clock layer. On the SoC
we have two registers one for clock gating (turn ON/OFF) and other for
monitoring the clock (ie CLOK_ON and CLK_MON registers). When we turn
ON the clock the CLK_ON bit gets set and to make sure it's turned ON
the corresponding CLK_MON bit is checked to ensure it's ON. When a
request is made to turn ON the clock first we check the CLK_MON bit
and if it's being set we return early as the clock was ON. This worked
OK up until now where the clocks used were internally generated.

In the case of RGMII interface where the Rx/Rx-180 clock was coming
from an PHY on an external pin the above didn't work as expected. When
we issued an unbind request on the glue driver all the clocks were
gated to OFF state i.e CLK_ON bits were set to '0'. Now when the bind
operation was requested  the clocks were requested to be turned ON, ie
when CLK_MON bits for RX/Rx-180 reported to be '1'  that is because
PHY was providing the clock and due to which the CLK_ON bit was unset
(and not gated to ON state)  due to which the DMA reset operation
failed in our case.

After fixing the clock driver, to ignore CLK_MON bits for clocks
coming from external pins and just relying on CLK_ON bits I am seeing
no more DMA reset timeouts.

Below are the logs, for unbind/bind operation working for both
eth0/eth1 interfaces.
-------------------------------------------------------------------------------------------------------------
root@...2h-evk-alpha:~# ethtool --show-eee eth0
EEE Settings for eth0:
        EEE status: enabled - active
        Tx LPI: 1000000 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full
root@...2h-evk-alpha:~# ethtool --show-eee eth1
EEE Settings for eth1:
        EEE status: enabled - inactive
        Tx LPI: 1000000 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
        Link partner advertised EEE link modes:  Not reported
root@...2h-evk-alpha:~#
root@...2h-evk-alpha:~# cd /sys/bus/platform/drivers/renesas-gbeth/ &&
echo 15c30000.ethernet > unbind && cd -
[  440.623862] renesas-gbeth 15c30000.ethernet eth0:
stmmac_dvr_remove: removing driver
[  440.632176] renesas-gbeth 15c30000.ethernet eth0: Link is Down
/home/root
root@...2h-evk-alpha:~# cd /sys/bus/platform/drivers/renesas-gbeth/ &&
echo 15c30000.ethernet > bind && cd -
[  448.384065] renesas-gbeth 15c30000.ethernet: IRQ sfty not found
[  448.391491] renesas-gbeth 15c30000.ethernet: User ID: 0x1, Synopsys ID: 0x52
[  448.398735] renesas-gbeth 15c30000.ethernet:         DWMAC4/5
[  448.404007] renesas-gbeth 15c30000.ethernet: DMA HW capability
register supported
[  448.411633] renesas-gbeth 15c30000.ethernet: RX Checksum Offload
Engine supported
[  448.419245] renesas-gbeth 15c30000.ethernet: Wake-Up On Lan supported
[  448.425916] renesas-gbeth 15c30000.ethernet: Enable RX Mitigation
via HW Watchdog Timer
[  448.433977] renesas-gbeth 15c30000.ethernet: Enabled L3L4 Flow TC (entries=8)
[  448.441167] renesas-gbeth 15c30000.ethernet: Enabled RFS Flow TC (entries=10)
[  448.448352] renesas-gbeth 15c30000.ethernet: Using 32/32 bits DMA
host/device width
/home/root
[  448.468325] renesas-gbeth 15c30000.ethernet eth0: Register
MEM_TYPE_PAGE_POOL RxQ-0
[  448.476762] renesas-gbeth 15c30000.ethernet eth0: Register
MEM_TYPE_PAGE_POOL RxQ-1
[  448.484756] renesas-gbeth 15c30000.ethernet eth0: Register
MEM_TYPE_PAGE_POOL RxQ-2
[  448.492705] renesas-gbeth 15c30000.ethernet eth0: Register
MEM_TYPE_PAGE_POOL RxQ-3
root@...2h-evk-alpha:~# [  448.564797] renesas-gbeth 15c30000.ethernet
eth0: PHY [stmmac-0:00] driver [Microchip KSZ9131 Gigabit PHY]
(irq=POLL)
[  448.587333] dwmac4: Master AXI performs fixed burst length
[  448.592868] renesas-gbeth 15c30000.ethernet eth0: No Safety
Features support found
[  448.600466] renesas-gbeth 15c30000.ethernet eth0: IEEE 1588-2008
Advanced Timestamp supported
[  448.609409] renesas-gbeth 15c30000.ethernet eth0: registered PTP clock
[  448.616637] renesas-gbeth 15c30000.ethernet eth0: configuring for
phy/rgmii-id link mode
[  451.704179] renesas-gbeth 15c30000.ethernet eth0: Link is Up -
1Gbps/Full - flow control rx/tx

root@...2h-evk-alpha:~# cd /sys/bus/platform/drivers/renesas-gbeth/ &&
echo 15c40000.ethernet > unbind && cd -
[  486.269883] renesas-gbeth 15c40000.ethernet eth1:
stmmac_dvr_remove: removing driver
[  486.278374] renesas-gbeth 15c40000.ethernet eth1: Link is Down
[  486.323261] audit: type=1334 audit(1741436947.780:15): prog-id=13 op=LOAD
[  486.330102] audit: type=1334 audit(1741436947.788:16): prog-id=14 op=LOAD
/home/root
root@...2h-evk-alpha:~#
root@...2h-evk-alpha:~# cd /sys/bus/platform/drivers/renesas-gbeth/ &&
echo 15c40000.ethernet > bind && cd -
[  491.951054] renesas-gbeth 15c40000.ethernet: IRQ sfty not found
[  491.958576] renesas-gbeth 15c40000.ethernet: User ID: 0x0, Synopsys ID: 0x52
[  491.965819] renesas-gbeth 15c40000.ethernet:         DWMAC4/5
[  491.971087] renesas-gbeth 15c40000.ethernet: DMA HW capability
register supported
[  491.978705] renesas-gbeth 15c40000.ethernet: RX Checksum Offload
Engine supported
[  491.986423] renesas-gbeth 15c40000.ethernet: Wake-Up On Lan supported
[  491.992927] renesas-gbeth 15c40000.ethernet: Enable RX Mitigation
via HW Watchdog Timer
[  492.001011] renesas-gbeth 15c40000.ethernet: device MAC address
de:08:73:a2:96:42
[  492.008547] renesas-gbeth 15c40000.ethernet: Enabled L3L4 Flow TC (entries=8)
[  492.015793] renesas-gbeth 15c40000.ethernet: Enabled RFS Flow TC (entries=10)
[  492.022941] renesas-gbeth 15c40000.ethernet: Using 32/32 bits DMA
host/device width
/home/root
root@...2h-evk-alpha:~# [  492.045756] renesas-gbeth 15c40000.ethernet
eth1: Register MEM_TYPE_PAGE_POOL RxQ-0
[  492.055375] renesas-gbeth 15c40000.ethernet eth1: Register
MEM_TYPE_PAGE_POOL RxQ-1
[  492.063568] renesas-gbeth 15c40000.ethernet eth1: Register
MEM_TYPE_PAGE_POOL RxQ-2
[  492.071575] renesas-gbeth 15c40000.ethernet eth1: Register
MEM_TYPE_PAGE_POOL RxQ-3
[  492.123684] renesas-gbeth 15c40000.ethernet eth1: PHY [stmmac-1:00]
driver [Microchip KSZ9131 Gigabit PHY] (irq=POLL)
[  492.146301] dwmac4: Master AXI performs fixed burst length
[  492.151840] renesas-gbeth 15c40000.ethernet eth1: No Safety
Features support found
[  492.159438] renesas-gbeth 15c40000.ethernet eth1: IEEE 1588-2008
Advanced Timestamp supported
[  492.168266] renesas-gbeth 15c40000.ethernet eth1: registered PTP clock
[  492.175237] renesas-gbeth 15c40000.ethernet eth1: configuring for
phy/rgmii-id link mode

root@...2h-evk-alpha:~# [  495.255185] renesas-gbeth 15c40000.ethernet
eth1: Link is Up - 1Gbps/Full - flow control off
[  527.352545] audit: type=1334 audit(1741436988.809:17): prog-id=14 op=UNLOAD
[  527.359538] audit: type=1334 audit(1741436988.809:18): prog-id=13 op=UNLOAD

root@...2h-evk-alpha:~#

Cheers,
Prabhakar

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ