Message-ID: <22aac4ec-2a22-42ee-20ee-9e9d6097b9d9@gmail.com>
Date: Sun, 23 Jan 2022 10:29:49 -0800
From: Florian Fainelli <f.fainelli@...il.com>
To: Jisheng Zhang <jszhang@...nel.org>, Andrew Lunn <andrew@...n.ch>,
Joakim Zhang <qiangqing.zhang@....com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@...com>,
Alexandre Torgue <alexandre.torgue@...s.st.com>,
Jose Abreu <joabreu@...opsys.com>,
"David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
netdev@...r.kernel.org, linux-stm32@...md-mailman.stormreply.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: stmmac: don't stop RXC during LPI

On 1/23/2022 8:09 AM, Jisheng Zhang wrote:
> On Mon, Jan 24, 2022 at 12:08:22AM +0800, Jisheng Zhang wrote:
>> On Sun, Jan 23, 2022 at 04:52:29PM +0100, Andrew Lunn wrote:
>>> On Sun, Jan 23, 2022 at 10:12:45PM +0800, Jisheng Zhang wrote:
>>>> I met can't receive rx pkt issue with below steps:
>>>> 0.plug in ethernet cable then boot normal and get ip from dhcp server
>>>> 1.quickly hotplug out then hotplug in the ethernet cable
>>>> 2.trigger the dhcp client to renew lease
>>>>
>>>> tcpdump shows that the request tx pkt is sent out successfully,
>>>> but the mac can't receive the rx pkt.
>>>>
>>>> The issue can easily be reproduced on platforms with PHY_POLL external
>>>> phy. If we don't allow the phy to stop the RXC during LPI, the issue
>>>> is gone. I think it's unsafe to stop the RXC during LPI because the mac
>>>> needs RXC clock to support RX logic.
>>>>
>>>> And the 2nd param clk_stop_enable of phy_init_eee() is a bool, so use
>>>> false instead of 0.
>>>>
>>>> Signed-off-by: Jisheng Zhang <jszhang@...nel.org>
>>>> ---
>>>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> index 6708ca2aa4f7..92a9b0b226b1 100644
>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> @@ -1162,7 +1162,7 @@ static void stmmac_mac_link_up(struct phylink_config *config,
>>>>
>>>> stmmac_mac_set(priv, priv->ioaddr, true);
>>>> if (phy && priv->dma_cap.eee) {
>>>> - priv->eee_active = phy_init_eee(phy, 1) >= 0;
>>>> + priv->eee_active = phy_init_eee(phy, false) >= 0;
>>>
>>> This has not caused issues in the past. So i'm wondering if this is
>>> somehow specific to your system? Does everybody else use a PHY which
>>> does not implement this bit? Does your synthesis of the stmmac have a
>>> different clock tree?
>>>
>>> By changing this value for every instance of the stmmac, you are
>>> potentially causing a power regression for stmmac implementations
>>> which don't need the clock. So we need a clear understanding, stopping
>>> the clock is wrong in general and so the change is correct in
>>
>> I think this is a common issue because the MAC needs phy's RXC for RX
>> logic. But it's better to let other stmmac users verify. The issue
>> can easily be reproduced on platforms with PHY_POLL external phy.
>> Or other platforms use a dedicated clock rather than clock from phy
>> for MAC's RX logic?
>>
>> If the issue turns out specific to my system, then I will send out
>> a new patch to adopt your suggestion.
>>
>
> + Joakim
>
>> Hi Joakim, IIRC, you have stmmac + external RTL8211F phy platform, but
>> I'm not sure whether your platform have an irq for the phy. could you
>> help me to check whether you can reproduce the issue on your platform?
>>
>>> general. Or this is specific to your system, and you probably need to
>>> add priv->dma_cap.keep_rx_clock_ticking, which you set in your glue
>>> driver,and use here to decide what to pass to phy_init_eee().

I suspect the problem is only, or largely, relevant in an RGMII
configuration, where the TXC of the MAC is an input to the PHY, which
then re-generates a clock and feeds it back to the MAC as RXC (with the
configured delay). If the PHY stops its clock, the MAC no longer gets an
RXC, and all sorts of problems would arise if the MAC's RX logic depends
on the PHY's RXC being re-sampled internally within the MAC.

Now, this would be symptomatic of a fairly naive design on the MAC side
to support EEE. Usually, to really save power while in LPI, you would
want to switch your MAC from its main or fast clock (which is presumably
at least 250MHz to support Gigabit rates and generate a 125MHz TXC) to a
slow clock (say 25 or 27MHz), in order to actually save power on the MAC
side (even if the bulk of the power is in the PHY's analog logic). When
the PHY signals that we are out of LPI, the MAC switches back to its
main clock. This may occur with the help of the MAC driver, or it can
sometimes be done autonomously.
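
To make the clock switching above concrete, here is a minimal
standalone model. All names are made up for illustration, and the two
rates are just the example figures from the paragraph above; real
hardware may do this in logic without any driver involvement.

```c
#include <stdbool.h>

/* Hypothetical MAC clock sources, using the example rates above. */
enum model_mac_clk {
	MODEL_CLK_MAIN_250MHZ, /* fast clock for Gigabit operation */
	MODEL_CLK_SLOW_25MHZ,  /* slow clock used while in LPI */
};

struct model_mac {
	enum model_mac_clk clk;
};

/* Called whenever the PHY's LPI indication changes. */
static void model_mac_lpi_update(struct model_mac *mac, bool phy_in_lpi)
{
	mac->clk = phy_in_lpi ? MODEL_CLK_SLOW_25MHZ : MODEL_CLK_MAIN_250MHZ;
}
```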

So with all that theory about how things should be designed, I think you
need to investigate this problem a bit more thoroughly.

FWIW, phy_init_eee()'s second argument is improperly designed. Before
deciding to stop the PHY's RX clock, you should first know whether the
PHY supports it to begin with; otherwise you are requesting something
the PHY is not able to do, and there is no feedback mechanism. A while
back I started this patch series, which may still be relevant:

https://github.com/ffainelli/linux/commits/phy-eee-tx-clk
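
As a rough illustration of the kind of feedback I mean (a standalone
model, not the contents of that series): check a capability bit before
setting the enable bit, and return an error when the PHY cannot stop
its clock. The struct, function and MODEL_* names are hypothetical; the
bit values are my assumption of the clause 45 PCS "clock stop capable"
and "clock stop enable" bits and should be checked against the
standard.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: none of these names exist in the kernel. */
#define MODEL_PCS_STAT1_CLKSTOP_CAP 0x0040u /* assumed: PHY can stop RXC in LPI */
#define MODEL_PCS_CTRL1_CLKSTOP_EN  0x0400u /* assumed: stop RXC while in LPI */

struct model_phy {
	uint16_t pcs_stat1; /* capability/status bits */
	uint16_t pcs_ctrl1; /* control bits */
};

/*
 * Request (or cancel) RX clock stop during LPI.
 * Returns 0 on success, or -EOPNOTSUPP when clock stop is requested
 * but the PHY does not advertise the capability, so the caller learns
 * that the request was not honoured.
 */
static int model_phy_request_clk_stop(struct model_phy *phy, bool enable)
{
	if (!enable) {
		phy->pcs_ctrl1 &= (uint16_t)~MODEL_PCS_CTRL1_CLKSTOP_EN;
		return 0;
	}

	if (!(phy->pcs_stat1 & MODEL_PCS_STAT1_CLKSTOP_CAP))
		return -EOPNOTSUPP;

	phy->pcs_ctrl1 |= MODEL_PCS_CTRL1_CLKSTOP_EN;
	return 0;
}
```

With a return value like this, a MAC driver can tell "clock stop
enabled" apart from "PHY cannot do it" instead of passing a bool and
hoping for the best.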
--
Florian