Date:   Mon, 29 Aug 2022 12:01:25 +0200
From:   Jerome Brunet <jbrunet@...libre.com>
To:     Heiner Kallweit <hkallweit1@...il.com>,
        Da Xue <da@...sconfused.com>
Cc:     Giuseppe Cavallaro <peppe.cavallaro@...com>,
        Alexandre Torgue <alexandre.torgue@...s.st.com>,
        Jose Abreu <joabreu@...opsys.com>,
        Erico Nunes <nunes.erico@...il.com>, netdev@...r.kernel.org,
        linux-amlogic@...ts.infradead.org,
        Kevin Hilman <khilman@...libre.com>,
        Neil Armstrong <narmstrong@...libre.com>,
        Vyacheslav <adeep@...ina.in>, Qi Duan <qi.duan@...ogic.com>
Subject: Re: [RFC/RFT PATCH] net: stmmac: do not poke MAC_CTRL_REG twice on
 link up


On Fri 26 Aug 2022 at 22:36, Heiner Kallweit <hkallweit1@...il.com> wrote:

> On 26.08.2022 17:45, Da Xue wrote:
>> Hi Heiner,
>> 
>> I have been running with the patch reverted for about two weeks now
>> without issue but I have a modified u-boot with ethernet bringup
>> disabled.
>> 
>> If u-boot brings up ethernet, all of the GXL boards with more than 1GB
>> memory experience various bugs. I had to bring the PHY initialization
>> patch into Linux proper:
>> https://github.com/libre-computer-project/libretech-linux/commit/1a4004c11877d4239b57b182da1ce69a81c0150c
>> 
> Thanks for the follow-up. To be acceptable upstream I'm pretty sure that
> the maintainer is going to request replacing magic number 0x10110181
> with or'ing proper constants for the respective bits and fields.
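
As a rough illustration of that suggestion, the sketch below rebuilds 0x10110181 from named pieces instead of writing the literal. All names (EX_ID_MASK, EX_ID_VALUE, EX_FLAG, ex_compose) and the split into a 22-bit field plus one flag bit are assumptions made for the example; the real bit and field definitions would have to come from the hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and field layout, for illustration only. */
#define EX_BIT(n)     (UINT32_C(1) << (n))
#define EX_ID_MASK    UINT32_C(0x003fffff)  /* assumed field: bits 21:0 */
#define EX_ID_VALUE   UINT32_C(0x110181)    /* assumed value of that field */
#define EX_FLAG       EX_BIT(28)            /* assumed single-bit flag */

/* Build the register value by OR'ing named constants. */
static inline uint32_t ex_compose(void)
{
	uint32_t val = (EX_ID_VALUE & EX_ID_MASK) | EX_FLAG;

	/* The field value must fit in its mask, otherwise bits were lost. */
	assert((EX_ID_VALUE & ~EX_ID_MASK) == 0);
	return val;
}
```

With this layout, ex_compose() yields the same 32-bit value as the literal, while each contributing bit or field carries a name that can be reviewed.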
>
>> Hope this helps someone.
>> 
>> Best,
>> 
>> Da
>> 
>> On Fri, Aug 26, 2022 at 5:51 AM Heiner Kallweit <hkallweit1@...il.com> wrote:
>>>
>>> On 07.07.2022 12:14, Jerome Brunet wrote:
>>>> For some reason, poking MAC_CTRL_REG a second time, even with the same
>>>> value, causes a problem on a dwmac 3.70a.
>>>>
>>>> This problem happens on all the Amlogic SoCs, on link up, when the RMII
>>>> 10/100 internal interface is used. The problem does not happen on boards
>>>> using the external RGMII 10/100/1000 interface. Initially we suspected the
>>>> PHY to be the problem but after a lot of testing, the problem seems to be
>>>> coming from the MAC controller.
>>>>
>>>>> meson8b-dwmac c9410000.ethernet: IRQ eth_wake_irq not found
>>>>> meson8b-dwmac c9410000.ethernet: IRQ eth_lpi not found
>>>>> meson8b-dwmac c9410000.ethernet: PTP uses main clock
>>>>> meson8b-dwmac c9410000.ethernet: User ID: 0x11, Synopsys ID: 0x37
>>>>> meson8b-dwmac c9410000.ethernet:     DWMAC1000
>>>>> meson8b-dwmac c9410000.ethernet: DMA HW capability register supported
>>>>> meson8b-dwmac c9410000.ethernet: RX Checksum Offload Engine supported
>>>>> meson8b-dwmac c9410000.ethernet: COE Type 2
>>>>> meson8b-dwmac c9410000.ethernet: TX Checksum insertion supported
>>>>> meson8b-dwmac c9410000.ethernet: Wake-Up On Lan supported
>>>>> meson8b-dwmac c9410000.ethernet: Normal descriptors
>>>>> meson8b-dwmac c9410000.ethernet: Ring mode enabled
>>>>> meson8b-dwmac c9410000.ethernet: Enable RX Mitigation via HW Watchdog Timer
>>>>
>>>> The problem is not systematic. Its occurrence is very random, from 1/50 to
>>>> 1/2. It is fairly easy to detect by setting the kernel to boot over NFS and
>>>> possibly setting it to reboot automatically when reaching the prompt.
>>>>
>>>> When the problem happens, the link is reported up by the PHY but no packets
>>>> are actually going out. DHCP requests eventually time out and the kernel
>>>> resets the interface. It may take several attempts but it will eventually work.
>>>>
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
>>>>> Sending DHCP requests ...... timed out!
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: Link is Down
>>>>> IP-Config: Retrying forever (NFS root)...
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.1:08] driver [Meson G12A Internal PHY] (irq=POLL)
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rmii link mode
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
>>>>> Sending DHCP requests ...... timed out!
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: Link is Down
>>>>> IP-Config: Retrying forever (NFS root)...
>>>>> [...] 5 retries ...
>>>>> IP-Config: Retrying forever (NFS root)...
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.1:08] driver [Meson G12A Internal PHY] (irq=POLL)
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rmii link mode
>>>>> meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
>>>>> Sending DHCP requests ., OK
>>>>> IP-Config: Got DHCP answer from 10.1.1.1, my address is 10.1.3.229
>>>>
>>>> Of course the same problem happens when not using NFS, and it is fairly
>>>> difficult for IoT products to detect this situation and recover.
>>>>
>>>> The call to stmmac_mac_set() should be a no-op in our case: the bits it sets
>>>> have already been set by an earlier call to stmmac_mac_set(). However,
>>>> removing this call solves the problem. We have no idea why, or what the
>>>> actual problem is.
>>>>
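
The "no-op" reasoning above can be sketched in user space: a read-modify-write helper that only sets enable bits computes the same value the second time, and one possible way to avoid touching the register again is to skip the write when nothing changes. This is a simulation under stated assumptions, not the stmmac code: struct ex_reg, ex_readl(), ex_writel(), ex_mac_set() and the EX_ENABLE_* bit positions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enable bits; positions are assumed for the example. */
#define EX_ENABLE_RX (UINT32_C(1) << 2)
#define EX_ENABLE_TX (UINT32_C(1) << 3)

/* Simulated register that counts how often it is written. */
struct ex_reg {
	uint32_t value;
	unsigned int writes;
};

static inline uint32_t ex_readl(const struct ex_reg *r)
{
	return r->value;
}

static inline void ex_writel(struct ex_reg *r, uint32_t v)
{
	r->value = v;
	r->writes++;
}

/* Set or clear the enable bits, writing only if the value changes. */
static inline void ex_mac_set(struct ex_reg *r, bool enable)
{
	uint32_t old = ex_readl(r);
	uint32_t val = old;

	if (enable)
		val |= EX_ENABLE_RX | EX_ENABLE_TX;
	else
		val &= ~(EX_ENABLE_RX | EX_ENABLE_TX);

	if (val != old)
		ex_writel(r, val);
}
```

Calling ex_mac_set(&reg, true) twice on the simulated register leaves the write counter at one, which makes the second call a true no-op at the bus level and not only in terms of the stored value.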
>>>> Even weirder, keeping the call to stmmac_mac_set() but inserting a
>>>> udelay(1) between writel() and stmmac_mac_set() solves the problem too.
>>>>
>>>> Suggested-by: Qi Duan <qi.duan@...ogic.com>
>>>> Signed-off-by: Jerome Brunet <jbrunet@...libre.com>
>>>> ---
>>>>
>>>>  Hi,
>>>>
>>>>  There is no intention to get this patch merged as it is.
>>>>  It is sent in the hope of getting a better understanding of the issue
>>>>  and more testing.
>>>>
>>>>  The discussion on this issue initially started on this thread
>>>>  https://lore.kernel.org/all/CAK4VdL3-BEBzgVXTMejrAmDjOorvoGDBZ14UFrDrKxVEMD2Zjg@mail.gmail.com/
>>>>
>>>>  The patches previously proposed in this thread have not solved the
>>>>  problem.
>>>>
>>>>  The line removed in this patch should be a no-op when it comes to the
>>>>  value of MAC_CTRL_REG. So the change should not make a difference, but
>>>>  it does. Testing results have been very good so far, so there must be an
>>>>  unexpected consequence on the HW. I hope that someone with more
>>>>  knowledge of this controller will be able to shed some light on this.
>>>>
>>>>  Cheers
>>>>  Jerome
>>>>
>>>>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 -
>>>>  1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> index d1a7cf4567bc..3dca3cc61f39 100644
>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> @@ -1072,7 +1072,6 @@ static void stmmac_mac_link_up(struct phylink_config *config,
>>>>
>>>>       writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
>>>>
>>>> -     stmmac_mac_set(priv, priv->ioaddr, true);
>>>>       if (phy && priv->dma_cap.eee) {
>>>>               priv->eee_active = phy_init_eee(phy, 1) >= 0;
>>>>               priv->eee_enabled = stmmac_eee_init(priv);
>>>
>>> Now that we have a3a57bf07de2 ("net: stmmac: work around sporadic tx issue on link-up")
>>> in linux-next and scheduled for stable:

I will

>>>
>>> Jerome, can you confirm that after this commit the following is no longer needed?
>>> 2c87c6f9fbdd ("net: phy: meson-gxl: improve link-up behavior")

This never had any meaningful impact for me. I have already reverted it
for testing.

I'm all for reverting it

>>>
>>> Then I'd revert it, referencing the successor workaround / fix in stmmac.
