Message-ID: <434bff80-0b89-4fe5-beb2-4b70a4b600d8@gmail.com>
Date: Tue, 3 Feb 2026 18:28:45 +0200
From: Ovidiu Panait <ovidiu.panait.oss@...il.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Alexandre Torgue <alexandre.torgue@...s.st.com>,
Andrew Lunn <andrew@...n.ch>, Andrew Lunn <andrew+netdev@...n.ch>,
Clark Wang <xiaoning.wang@....com>,
Daniel Scally <dan.scally@...asonboard.com>,
"David S. Miller" <davem@...emloft.net>,
Emanuele Ghidoli <ghidoliemanuele@...il.com>,
Eric Dumazet <edumazet@...gle.com>, Fabio Estevam <festevam@...il.com>,
Heiner Kallweit <hkallweit1@...il.com>, imx@...ts.linux.dev,
Jakub Kicinski <kuba@...nel.org>,
Kieran Bingham <kieran.bingham@...asonboard.com>,
linux-arm-kernel@...ts.infradead.org,
linux-stm32@...md-mailman.stormreply.com,
Maxime Coquelin <mcoquelin.stm32@...il.com>, netdev@...r.kernel.org,
Oleksij Rempel <o.rempel@...gutronix.de>, Paolo Abeni <pabeni@...hat.com>,
Pengutronix Kernel Team <kernel@...gutronix.de>,
Rob Herring <robh@...nel.org>, Sascha Hauer <s.hauer@...gutronix.de>,
Shawn Guo <shawnguo@...nel.org>, Stefan Klug <stefan.klug@...asonboard.com>,
Wei Fang <wei.fang@....com>,
Laurent Pinchart <laurent.pinchart@...asonboard.com>
Subject: Re: [PATCH RFC net-next] net: stmmac: provide flag to disable EEE
On 2/3/26 5:43 PM, Russell King (Oracle) wrote:
> On Tue, Feb 03, 2026 at 05:42:07PM +0200, Ovidiu Panait wrote:
>>
>> Hi Russell,
>>
>> On 2/3/26 12:23 AM, Russell King (Oracle) wrote:
>>> On Mon, Feb 02, 2026 at 08:54:52PM +0200, Ovidiu Panait wrote:
>>>>
>>>> Hi Russell,
>>>>
>>>> On 11/24/25 1:27 PM, Russell King (Oracle) wrote:
>>>>> Some platforms have problems when EEE is enabled, and thus need a way
>>>>> to disable stmmac EEE support. Add a flag before the other LPI related
>>>>> flags which tells stmmac to avoid populating the phylink LPI
>>>>> capabilities, which causes phylink to call phy_disable_eee() for any
>>>>> PHY that is attached to the affected phylink instance.
>>>>>
>>>>> iMX8MP is an example - the lpi_intr_o signal is wired to an OR gate
>>>>> along with the main dwmac interrupts. Since lpi_intr_o is synchronous
>>>>> to the receive clock domain, and takes four clock cycles to clear, this
>>>>> leads to interrupt storms as the interrupt remains asserted for some
>>>>> time after the LPI control and status register is read.
>>>>>
>>>>> This problem becomes worse when the receive clock from the PHY stops
>>>>> when the receive path enters LPI state - which means that lpi_intr_o
>>>>> can not deassert until the clock restarts. Since the LPI state of the
>>>>> receive path depends on the link partner, this is out of our control.
>>>>> We could disable RX clock stop at the PHY, but that doesn't get around
>>>>> the slow-to-deassert lpi_intr_o mentioned in the above paragraph.
>>>>>
>>>>> Previously, iMX8MP worked around this by disabling gigabit EEE, but
>>>>> this is insufficient - the problem is also visible at 100M speeds,
>>>>> where the receive clock is slower.
>>>>>
>>>>> There is extensive discussion and investigation in the thread linked
>>>>> below, the result of which is summarised in this commit message.
>>>>>
>>>>
>>>> We are seeing the same lpi_intr_o interrupt storm on the Renesas RZ/V2H
>>>> EVK (dwmac-renesas-gbeth.c). On this platform, lpi_intr_o is routed as a
>>>> separate, dedicated interrupt line to the CPU rather than being OR'd
>>>> with the main DWMAC interrupt as on iMX8MP. This corresponds to the
>>>> "eth_lpi" interrupt in the stmmac bindings:
>>>> """
>>>> - description: The interrupt that occurs when Rx exits the LPI state
>>>> const: eth_lpi
>>>> """
>>>>
>>>> Looking through the other glue drivers/device-trees, it looks to me that
>>>> every platform that defines a separate "eth_lpi" irq might have the
>>>> interrupt storm problem.
>>>
>>> That is highly likely.
>>>
>>>> To fix this issue on these platforms, rather than disabling EEE
>>>> altogether, would it be possible to just not request the eth_lpi
>>>> interrupt and let EEE continue to work? Perhaps a new flag could let
>>>> each platform decide.
>>>
>>> Yes, because lpi_intr_o serves no purpose from a software point of
>>> view - see the commit message below for the details. I do like
>>> removing code from stmmac :)
>>>
>>>> If not, maybe this patch could be merged to add the flag that disables
>>>> EEE and I will just send a patch to disable EEE on our platforms as well.
>>>
>>> We still need the flag to disable EEE for platforms where lpi_intr_o is
>>> logically OR'd with the other interrupts, so there's no way to ignore
>>> its persistent assertion.
>>>
>>> 8<===
>>> From: "Russell King (Oracle)" <rmk+kernel@...linux.org.uk>
>>> Subject: [PATCH net-next] net: stmmac: remove support for lpi_intr_o
>>>
>>> The dwmac databook for v3.74a states that lpi_intr_o is a sideband
>>> signal which should be used to ungate the application clock, and this
>>> signal is synchronous to the receive clock. The receive clock can run
>>> at 2.5, 25 or 125MHz depending on the media speed, and can stop under
>>> the control of the link partner. This means that the time it takes to
>>> clear is dependent on the negotiated media speed, and thus can be 8,
>>> 40, or 400ns after reading the LPI control and status register.
>>>
>>> It has been observed that, with some aggressive link partners, this clock
>>> can stop while lpi_intr_o is still asserted, meaning that the signal
>>> remains asserted for an indefinite period that the local system has
>>> no direct control over.
>>>
>>> The LPI interrupts will still be signalled through the main interrupt
>>> path in any case, and this path is not dependent on the receive clock.
>>>
>>> Thus, since we do not gate the application clock, and the chances of
>>> adding clock gating in the future are slim due to the clocks being
>>> ill-defined, lpi_intr_o serves no useful purpose. Remove the code which
>>> requests the interrupt, and all associated code.
>>>
>>> Signed-off-by: Russell King (Oracle) <rmk+kernel@...linux.org.uk>
>>
>> Thanks for fixing this. I did some testing on the Renesas RZ/V2H board
>> with this patch and didn't see any issues:
>>
>> Tested-by: Ovidiu Panait <ovidiu.panait.rb@...esas.com>
>
> Would you say this is a regression or a new problem?
I don't think it is a regression per se. AFAICT this behavior has always
been present whenever EEE is enabled, but I only noticed it after I changed
the network switch the board was connected to.

Thanks,
Ovidiu
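
For reference, the timing figures quoted above can be reproduced with a
few lines of arithmetic. This is only a rough sketch, not anything taken
from the driver: the function names are made up for illustration, and the
four-cycle clear time is the figure quoted for iMX8MP, which may not hold
for other integrations. It also assumes the receive clock keeps running;
if the link partner stops the clock, the deassert time is unbounded.

```python
# Receive clock rates named in the quoted commit message, in Hz.
RX_CLOCK_HZ = [2_500_000, 25_000_000, 125_000_000]

# Assumption: lpi_intr_o takes four receive clock cycles to clear, as
# stated for iMX8MP in the quoted patch description.
CLEAR_CYCLES = 4

NSEC_PER_SEC = 1_000_000_000


def rx_clock_period_ns(freq_hz: int) -> int:
    """Return the length of one receive clock cycle in nanoseconds."""
    return NSEC_PER_SEC // freq_hz


def lpi_clear_time_ns(freq_hz: int, cycles: int = CLEAR_CYCLES) -> int:
    """Return the time needed for `cycles` receive clock cycles, in ns.

    Only meaningful while the receive clock is actually running.
    """
    return cycles * rx_clock_period_ns(freq_hz)


if __name__ == "__main__":
    for hz in RX_CLOCK_HZ:
        print(
            f"{hz / 1e6:g} MHz: period {rx_clock_period_ns(hz)} ns, "
            f"{CLEAR_CYCLES} cycles {lpi_clear_time_ns(hz)} ns"
        )
```

The single-cycle periods match the 8, 40 and 400ns mentioned in the
commit message; with the four-cycle assumption the signal would stay
asserted several times longer at each speed, which fits the observation
that the problem is also visible at the slower receive clocks.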