Message-ID: <22aac4ec-2a22-42ee-20ee-9e9d6097b9d9@gmail.com>
Date:   Sun, 23 Jan 2022 10:29:49 -0800
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Jisheng Zhang <jszhang@...nel.org>, Andrew Lunn <andrew@...n.ch>,
        Joakim Zhang <qiangqing.zhang@....com>
Cc:     Giuseppe Cavallaro <peppe.cavallaro@...com>,
        Alexandre Torgue <alexandre.torgue@...s.st.com>,
        Jose Abreu <joabreu@...opsys.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Maxime Coquelin <mcoquelin.stm32@...il.com>,
        netdev@...r.kernel.org, linux-stm32@...md-mailman.stormreply.com,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: stmmac: don't stop RXC during LPI



On 1/23/2022 8:09 AM, Jisheng Zhang wrote:
> On Mon, Jan 24, 2022 at 12:08:22AM +0800, Jisheng Zhang wrote:
>> On Sun, Jan 23, 2022 at 04:52:29PM +0100, Andrew Lunn wrote:
>>> On Sun, Jan 23, 2022 at 10:12:45PM +0800, Jisheng Zhang wrote:
>>>> I met can't receive rx pkt issue with below steps:
>>>> 0.plug in ethernet cable then boot normal and get ip from dhcp server
>>>> 1.quickly hotplug out then hotplug in the ethernet cable
>>>> 2.trigger the dhcp client to renew lease
>>>>
>>>> tcpdump shows that the request tx pkt is sent out successfully,
>>>> but the mac can't receive the rx pkt.
>>>>
>>>> The issue can easily be reproduced on platforms with PHY_POLL external
>>>> phy. If we don't allow the phy to stop the RXC during LPI, the issue
>>>> is gone. I think it's unsafe to stop the RXC during LPI because the mac
>>>> needs RXC clock to support RX logic.
>>>>
>>>> And the 2nd param clk_stop_enable of phy_init_eee() is a bool, so use
>>>> false instead of 0.
>>>>
>>>> Signed-off-by: Jisheng Zhang <jszhang@...nel.org>
>>>> ---
>>>>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> index 6708ca2aa4f7..92a9b0b226b1 100644
>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> @@ -1162,7 +1162,7 @@ static void stmmac_mac_link_up(struct phylink_config *config,
>>>>   
>>>>   	stmmac_mac_set(priv, priv->ioaddr, true);
>>>>   	if (phy && priv->dma_cap.eee) {
>>>> -		priv->eee_active = phy_init_eee(phy, 1) >= 0;
>>>> +		priv->eee_active = phy_init_eee(phy, false) >= 0;
>>>
>>> This has not caused issues in the past. So i'm wondering if this is
>>> somehow specific to your system? Does everybody else use a PHY which
>>> does not implement this bit? Does your synthesis of the stmmac have a
>>> different clock tree?
>>>
>>> By changing this value for every instance of the stmmac, you are
>>> potentially causing a power regression for stmmac implementations
>>> which don't need the clock. So we need a clear understanding, stopping
>>> the clock is wrong in general and so the change is correct in
>>
>> I think this is a common issue because the MAC needs phy's RXC for RX
>> logic. But it's better to let other stmmac users verify. The issue
>> can easily be reproduced on platforms with PHY_POLL external phy.
>> Or other platforms use a dedicated clock rather than clock from phy
>> for MAC's RX logic?
>>
>> If the issue turns out specific to my system, then I will send out
>> a new patch to adopt your suggestion.
>>
> 
> + Joakim
> 
>> Hi Joakim, IIRC, you have stmmac + external RTL8211F phy platform, but
>> I'm not sure whether your platform have an irq for the phy. could you
>> help me to check whether you can reproduce the issue on your platform?
>>
>>> general. Or this is specific to your system, and you probably need to
>>> add priv->dma_cap.keep_rx_clock_ticking, which you set in your glue
>>> driver,and use here to decide what to pass to phy_init_eee().

I suspect the problem is only or largely relevant in an RGMII
configuration whereby the TXC of the MAC is an input to the PHY, which
then re-generates the RXC and feeds it back to the MAC as RXC (with the
configured delay). If the PHY stops its clock, then the MAC no longer
gets an RXC, and all sorts of problems would arise if the MAC logic on
the RX side depends on the PHY's RXC being re-sampled internally within
the MAC.

Now, this would be symptomatic of a fairly naive design on the MAC side
to support EEE. Usually, to really save power while in LPI, you would
want to switch your MAC from its main or fast clock (which is presumably
at least 250MHz to support Gigabit rates and generate a 125MHz TXC) to a
slow clock (say 25 or 27MHz), even if the bulk of the power is consumed
by the PHY's analog logic. When the PHY signals that we are out of LPI,
the MAC switches back to its main clock. This may occur with the help of
the MAC driver, or it can sometimes be done autonomously by the
hardware.
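
To illustrate the clock switch described above, here is a minimal
userspace sketch. It is a toy model, not stmmac code: the structure,
the function names and the two rates (250MHz fast, 25MHz slow, taken
from the figures above) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rates, based on the figures quoted in the text. */
#define MAC_FAST_CLK_HZ 250000000UL
#define MAC_SLOW_CLK_HZ  25000000UL

/* Hypothetical model of a MAC clock mux; not an stmmac structure. */
struct mac_clk_model {
	unsigned long clk_hz;	/* clock currently feeding the MAC logic */
	bool in_lpi;		/* true while the link is in LPI */
};

/* Start out on the fast clock with the link active. */
void mac_clk_init(struct mac_clk_model *mac)
{
	assert(mac != NULL);
	mac->clk_hz = MAC_FAST_CLK_HZ;
	mac->in_lpi = false;
}

/*
 * React to the PHY signalling LPI entry (true) or exit (false):
 * drop to the slow clock while in LPI, return to the fast clock
 * once the link is active again.
 */
void mac_lpi_event(struct mac_clk_model *mac, bool lpi_asserted)
{
	assert(mac != NULL);
	mac->in_lpi = lpi_asserted;
	mac->clk_hz = lpi_asserted ? MAC_SLOW_CLK_HZ : MAC_FAST_CLK_HZ;
}
```

Whether such a switch is driven by the MAC driver or done by the
hardware on its own depends on the design; the model only shows the
intended state transitions.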

So with all that theory about how things should be designed, I think
you need to investigate this problem a bit more thoroughly.

FWIW phy_init_eee()'s second argument is improperly designed. Before
deciding to stop the PHY's RX clock, you should first know whether the
PHY supports it to begin with, otherwise you are requesting something
the PHY is not able to do, and there is no feedback mechanism. A while
back I had started this patch series which may still be relevant:

https://github.com/ffainelli/linux/commits/phy-eee-tx-clk
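
For illustration only, a capability check with feedback could look
roughly like the userspace sketch below. It is not taken from the
branch linked above. The mock structure and helper name are
hypothetical, and the bit values are meant to mirror
MDIO_PCS_STAT1_CLKSTOP_CAP and MDIO_PCS_CTRL1_CLKSTOP_EN from
include/uapi/linux/mdio.h as best I recall them, so please check them
against the header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed to match MDIO_PCS_STAT1_CLKSTOP_CAP and
 * MDIO_PCS_CTRL1_CLKSTOP_EN; redefined here so the sketch builds
 * outside the kernel.
 */
#define PCS_STAT1_CLKSTOP_CAP	0x0040
#define PCS_CTRL1_CLKSTOP_EN	0x0400

/* Hypothetical stand-in for the PHY's PCS registers. */
struct mock_phy {
	uint16_t pcs_stat1;	/* status: advertises clock stop capability */
	uint16_t pcs_ctrl1;	/* control: holds the clock stop enable bit */
};

/*
 * Hypothetical helper: enable clock stop only if it was requested and
 * the PHY advertises the capability. The return value is the feedback
 * the caller is missing today: true means clock stop is really enabled.
 */
bool mock_phy_request_clk_stop(struct mock_phy *phy, bool want_clk_stop)
{
	assert(phy != NULL);

	if (!want_clk_stop || !(phy->pcs_stat1 & PCS_STAT1_CLKSTOP_CAP)) {
		phy->pcs_ctrl1 &= (uint16_t)~PCS_CTRL1_CLKSTOP_EN;
		return false;
	}

	phy->pcs_ctrl1 |= PCS_CTRL1_CLKSTOP_EN;
	return true;
}
```

With something along these lines, a MAC driver could learn whether the
RXC may actually stop during LPI instead of assuming that the request
was honoured.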
-- 
Florian
