[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <81c0f21f-f8f1-f7b3-c52f-c6a564c6a445@samsung.com>
Date: Mon, 29 Aug 2022 13:40:05 +0200
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>, netdev@...r.kernel.org,
linux-usb@...r.kernel.org,
Steve Glendinning <steve.glendinning@...well.net>,
UNGLinuxDriver@...rochip.com, Oliver Neukum <oneukum@...e.com>,
Andre Edich <andre.edich@...rochip.com>,
Oleksij Rempel <linux@...pel-privat.de>,
Martyn Welch <martyn.welch@...labora.com>,
Gabriel Hojda <ghojda@...urs.ro>,
Christoph Fritz <chf.fritz@...glemail.com>,
Lino Sanfilippo <LinoSanfilippo@....de>,
Philipp Rosenberger <p.rosenberger@...bus.com>,
Heiner Kallweit <hkallweit1@...il.com>,
Andrew Lunn <andrew@...n.ch>,
Russell King <linux@...linux.org.uk>,
Ferry Toth <fntoth@...il.com>,
Krzysztof Kozlowski <krzk@...nel.org>,
'Linux Samsung SOC' <linux-samsung-soc@...r.kernel.org>
Subject: Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY
interrupts to PHY driver to avoid polling
Hi All,
On 26.08.2022 10:29, Marek Szyprowski wrote:
> On 26.08.2022 09:53, Lukas Wunner wrote:
>> On Fri, Aug 26, 2022 at 09:41:46AM +0200, Marek Szyprowski wrote:
>>> On 26.08.2022 09:19, Lukas Wunner wrote:
>>>> On Fri, Aug 26, 2022 at 08:51:58AM +0200, Marek Szyprowski wrote:
>>>>> On 19.05.2022 23:22, Marek Szyprowski wrote:
>>>>>> On 19.05.2022 21:08, Lukas Wunner wrote:
>>>>>>> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
>>>>>>>> This patch landed in the recent linux next-20220516 as commit
>>>>>>>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY
>>>>>>>> driver to
>>>>>>>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet
>>>>>>>> operation
>>>>>>>> after system suspend-resume cycle. On the Odroid XU3 board I
>>>>>>>> got the
>>>>>>>> following warning in the kernel log:
>>>>>>>>
>>>>>>>> # time rtcwake -s10 -mmem
>>>>>>>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17
>>>>>>>> 09:16:07 2022
>>>>>>>> PM: suspend entry (deep)
>>>>>>>> Filesystems sync: 0.001 seconds
>>>>>>>> Freezing user space processes ... (elapsed 0.002 seconds) done.
>>>>>>>> OOM killer disabled.
>>>>>>>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds)
>>>>>>>> done.
>>>>>>>> printk: Suspending console(s) (use no_console_suspend to debug)
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
>>>>>>>> ------------[ cut here ]------------
>>>>>>>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
>>>>>>>> phy_state_machine+0x98/0x28c
>>>>>>> [...]
>>>>>>>> It looks that the driver's suspend/resume operations might need
>>>>>>>> some
>>>>>>>> adjustments. After the system suspend/resume cycle the driver
>>>>>>>> is not
>>>>>>>> operational anymore. Reverting the $subject patch on top of linux
>>>>>>>> next-20220516 restores ethernet operation after system
>>>>>>>> suspend/resume.
>>>>>>> Thanks a lot for the report. It seems the PHY is signaling a link
>>>>>>> change
>>>>>>> shortly before system sleep and by the time the phy_state_machine()
>>>>>>> worker
>>>>>>> gets around to handle it, the device has already been suspended
>>>>>>> and thus
>>>>>>> refuses any further USB requests with -EHOSTUNREACH (-113):
>>>> [...]
>>>>>>> Assuming the above theory is correct, calling phy_stop_machine()
>>>>>>> after usbnet_suspend() would be sufficient to fix the issue.
>>>>>>> It cancels the phy_state_machine() worker.
>>>>>>>
>>>>>>> The small patch below does that. Could you give it a spin?
>>>>>> That's it. Your analysis is right and the patch fixes the issue.
>>>>>> Thanks!
>>>>>>
>>>>>> Feel free to add:
>>>>>>
>>>>>> Reported-by: Marek Szyprowski <m.szyprowski@...sung.com>
>>>>>>
>>>>>> Tested-by: Marek Szyprowski <m.szyprowski@...sung.com>
>>>>> Gentle ping for the final patch...
>>>> Hm? Actually this issue is supposed to be fixed by mainline commit
>>>> 1758bde2e4aa ("net: phy: Don't trigger state machine while in
>>>> suspend").
>>>>
>>>> The initial fix attempt that you're replying to should not be
>>>> necessary
>>>> with that commit.
>>>>
>>>> Are you still seeing issues even with 1758bde2e4aa applied?
>>>> Or are you maybe using a custom downstream tree which is missing
>>>> that commit?
>>> On Linux next-20220825 I still get the following warning during
>>> suspend/resume cycle:
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 1483 at drivers/net/phy/phy_device.c:323
>>> mdio_bus_phy_resume+0x10c/0x110
>>> Modules linked in: exynos_gsc s5p_jpeg s5p_mfc videobuf2_dma_contig
>>> v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev
>>> mc s5p_cec
>>> CPU: 0 PID: 1483 Comm: rtcwake Not tainted 6.0.0-rc2-next-20220825
>>> #5482
>>> Hardware name: Samsung Exynos (Flattened Device Tree)
>>> unwind_backtrace from show_stack+0x10/0x14
>>> show_stack from dump_stack_lvl+0x58/0x70
>>> dump_stack_lvl from __warn+0xc8/0x220
>>> __warn from warn_slowpath_fmt+0x5c/0xb4
>>> warn_slowpath_fmt from mdio_bus_phy_resume+0x10c/0x110
>>> mdio_bus_phy_resume from dpm_run_callback+0x94/0x208
>>> dpm_run_callback from device_resume+0x124/0x21c
>>> device_resume from dpm_resume+0x108/0x278
>>> dpm_resume from dpm_resume_end+0xc/0x18
>>> dpm_resume_end from suspend_devices_and_enter+0x208/0x70c
>>> suspend_devices_and_enter from pm_suspend+0x364/0x430
>>> pm_suspend from state_store+0x68/0xc8
>>> state_store from kernfs_fop_write_iter+0x110/0x1d4
>>> kernfs_fop_write_iter from vfs_write+0x1c4/0x2ac
>>> vfs_write from ksys_write+0x5c/0xd4
>>> ksys_write from ret_fast_syscall+0x0/0x1c
>>> Exception stack(0xf2ee5fa8 to 0xf2ee5ff0)
>>> 5fa0: 00000004 0002b438 00000004 0002b438 00000004
>>> 00000000
>>> 5fc0: 00000004 0002b438 000291b0 00000004 0002b438 00000004 befd9c1c
>>> 00028160
>>> 5fe0: 0000006c befd9ae8 b6eb4148 b6f118a4
>>> irq event stamp: 58381
>>> hardirqs last enabled at (58393): [<c019ff28>]
>>> vprintk_emit+0x320/0x344
>>> hardirqs last disabled at (58400): [<c019fedc>]
>>> vprintk_emit+0x2d4/0x344
>>> softirqs last enabled at (58258): [<c0101694>]
>>> __do_softirq+0x354/0x618
>>> softirqs last disabled at (58247): [<c012dd18>]
>>> __irq_exit_rcu+0x140/0x1ec
>>> ---[ end trace 0000000000000000 ]---
>>>
>>> The mentioned patch fixes it.
>> Color me confused.
>>
>> With "the mentioned patch", are you referring to 1758bde2e4aa
>> or are you referring to the little test patch in this e-mail:
>> https://lore.kernel.org/netdev/20220519190841.GA30869@wunner.de/
>>
>> There's a Tested-by from you on 1758bde2e4aa, so I assume the
>> commit fixed the issue at the time. Does it still occur
>> intermittently? Or does it occur every time? In the latter case,
>> I'd assume some other change was made in the meantime and that
>> other change exposed the issue again...
>
> The issue seems to be exposed again. On 1758bde2e4aa everything works
> fine and there is no warning during suspend/resume cycle. Then I've
> noticed that warning again while playing with current linux-next. I
> will check when it got broken and report again.
I've finally traced what has happened. I've double checked and indeed
the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as
such it has been merged to linus tree. Then the commit 744d23c71af3
("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been
merged to linus tree, which triggers a new warning during the
suspend/resume cycle with smsc95xx driver. Please note, that the
smsc95xx still works fine regardless that warning. However it look that
the commit 1758bde2e4aa only hide a real problem, which the commit
744d23c71af3 warns about.
Probably a proper fix for smsc95xx driver is to call phy_stop/start
during suspend/resume cycle, like in similar patches for other drivers:
https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Powered by blists - more mailing lists