netdev - Re: [RFC net-next 4/4] net: phy: Correctly process PHY_HALTED in phy_stop

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMuHMdXcWRAxpFejgKE5EksGbLUK80wDo2kjmc0rxyukhahMHA@mail.gmail.com>
Date:   Mon, 27 Nov 2017 08:48:11 +0100
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     Florian Fainelli <f.fainelli@...il.com>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Marc Gonzalez <marc_gonzalez@...madesigns.com>,
        "David S. Miller" <davem@...emloft.net>,
        Andrew Lunn <andrew@...n.ch>, opendmb@...il.com,
        Mason <slash.tmp@...e.fr>, David Daney <david.daney@...ium.com>,
        Geert Uytterhoeven <geert+renesas@...der.be>
Subject: Re: [RFC net-next 4/4] net: phy: Correctly process PHY_HALTED in phy_stop_machine()

Hi Florian,

On Mon, Nov 27, 2017 at 5:05 AM, Florian Fainelli <f.fainelli@...il.com> wrote:
> On 11/06/2017 07:50 AM, Geert Uytterhoeven wrote:
>> On Tue, Oct 31, 2017 at 5:33 PM, Florian Fainelli <f.fainelli@...il.com> wrote:
>>> On 10/31/2017 08:26 AM, Geert Uytterhoeven wrote:
>>>> On Mon, Oct 30, 2017 at 5:09 PM, Florian Fainelli <f.fainelli@...il.com> wrote:
>>>>> On 10/30/2017 06:56 AM, Geert Uytterhoeven wrote:
>>>>>> On Thu, Oct 26, 2017 at 1:21 AM, Florian Fainelli <f.fainelli@...il.com> wrote:
>>>>>>> Marc reported that he was not getting the PHY library adjust_link()
>>>>>>> callback function to run when calling phy_stop() + phy_disconnect()
>>>>>>> which does not indeed happen because we set the state machine to
>>>>>>> PHY_HALTED but we don't get to run it to process this state past that
>>>>>>> point.
>>>>>>>
>>>>>>> Fix this with a synchronous call to phy_state_machine() in order to have
>>>>>>> the state machine actually act on PHY_HALTED, set the PHY device's link
>>>>>>> down, turn the network device's carrier off and finally call the
>>>>>>> adjust_link() function.
>>>>>>>
>>>>>>> At the end of phy_state_machine() though, if we are going to be moving
>>>>>>> from PHY_HALTED to PHY_HALTED, do not reschedule the state machine, this
>>>>>>> is pointless.
>>>>>>>
>>>>>>> Reported-by: Marc Gonzalez <marc_gonzalez@...madesigns.com>
>>>>>>> Fixes: a390d1f379cf ("phylib: convert state_queue work to delayed_work")
>>>>>>> Signed-off-by: Marc Gonzalez <marc_gonzalez@...madesigns.com>
>>>>>>> Signed-off-by: Florian Fainelli <f.fainelli@...il.com>
>>>>>>
>>>>>> Thanks for your patch!
>>>>>>
>>>>>> Unfortunately, after applying this one, the last in your series, both
>>>>>> sh73a0/kzm9g and r8a73a4/ape6evm start crashing again in the system
>>>>>> suspend/resume path, due to register accesses while the device is already
>>>>>> suspended:
>>>>>
>>>>> OK, seems like there is another path, uncovered by this patch that we
>>>>> can be hitting, does the following patch below help?
>>>>
>>>> Unfortunately it doesn't help.
>>>
>>> OK :/
>>>
>>>>
>>>>>> Unhandled fault: imprecise external abort (0x1406) at 0x0005b950
>>>>
>>>> Note that this is an imprecise external abort, i.e. it's reporting may
>>>> be delayed,
>>>> and the backtrace may be inaccurate.
>>>
>>> True, can you help narrow it down with me? Can you confirm that
>>> adjust_link() (assuming that is the problem) does not get called past
>>> phy_stop_machine() as it should?
>>
>> I've added some additional debug checks (keep track of both phy and
>> smsc state, and refuse the access registers if smsc is disabled).
>
> Thanks for doing that, and sorry for responding that late.
>
>>
>> Apparently phy_stop_machine() is called twice:
>>   - Once from mdio_bus_phy_suspend(), cfr. the first backtrace,
>>   - A second time from smsc911x_suspend(), cfr. the second backtrace.
>>
>> The second call causes a call to smsc911x_phy_adjust_link() while the smsc is
>> already disabled, cfr. the third backtrace. This would trigger the imprecise
>> external abort if I let it access the registers.
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1083 at drivers/net/phy/phy.c:597
>> phy_stop_machine+0x44/0xcc
>> phy_stop_machine: phy running, good
>> CPU: 0 PID: 1083 Comm: bash Not tainted
>> 4.14.0-rc7-ape6evm-00443-gcdfc0e18a47e0bb3-dirty #637
>> Hardware name: Generic R8A73A4 (Flattened Device Tree)
>> [<c0018f3c>] (unwind_backtrace) from [<c0014f40>] (show_stack+0x10/0x14)
>> [<c0014f40>] (show_stack) from [<c04bd928>] (dump_stack+0xa4/0xdc)
>> [<c04bd928>] (dump_stack) from [<c002ac28>] (__warn+0xcc/0xfc)
>> [<c002ac28>] (__warn) from [<c002ac8c>] (warn_slowpath_fmt+0x34/0x44)
>> [<c002ac8c>] (warn_slowpath_fmt) from [<c0314680>] (phy_stop_machine+0x44/0xcc)
>> [<c0314680>] (phy_stop_machine) from [<c03162ec>]
>> (mdio_bus_phy_suspend+0x24/0x40)
>> [<c03162ec>] (mdio_bus_phy_suspend) from [<c0307090>]
>> (dpm_run_callback+0x17c/0x3ec)
>> [<c0307090>] (dpm_run_callback) from [<c030a584>] (__device_suspend+0x498/0x6b0)
>> [<c030a584>] (__device_suspend) from [<c030a974>] (dpm_suspend+0x1d8/0x568)
>> [<c030a974>] (dpm_suspend) from [<c00880bc>]
>> (suspend_devices_and_enter+0x78/0xe98)
>> [<c00880bc>] (suspend_devices_and_enter) from [<c008991c>]
>> (pm_suspend+0xa40/0xbec)
>> [<c008991c>] (pm_suspend) from [<c0086e68>] (state_store+0xac/0xcc)
>> [<c0086e68>] (state_store) from [<c01c7780>] (kernfs_fop_write+0x190/0x1d0)
>> [<c01c7780>] (kernfs_fop_write) from [<c0153418>] (__vfs_write+0x20/0x11c)
>> [<c0153418>] (__vfs_write) from [<c0153688>] (vfs_write+0xb8/0x144)
>> [<c0153688>] (vfs_write) from [<c0153808>] (SyS_write+0x40/0x80)
>> [<c0153808>] (SyS_write) from [<c0010200>] (ret_fast_syscall+0x0/0x28)
>> ---[ end trace 8fc4c71351438007 ]---
>> libphy: phy_stop_machine: Kicking state machine synchronously
>> libphy: phy_stop_machine: Kicking state machine done
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1083 at drivers/net/phy/phy.c:598
>> phy_stop_machine+0x64/0xcc
>> phy_stop_machine: phy already stopped
>> CPU: 0 PID: 1083 Comm: bash Tainted: G        W
>> 4.14.0-rc7-ape6evm-00443-gcdfc0e18a47e0bb3-dirty #637
>> Hardware name: Generic R8A73A4 (Flattened Device Tree)
>> [<c0018f3c>] (unwind_backtrace) from [<c0014f40>] (show_stack+0x10/0x14)
>> [<c0014f40>] (show_stack) from [<c04bd928>] (dump_stack+0xa4/0xdc)
>> [<c04bd928>] (dump_stack) from [<c002ac28>] (__warn+0xcc/0xfc)
>> [<c002ac28>] (__warn) from [<c002ac8c>] (warn_slowpath_fmt+0x34/0x44)
>> [<c002ac8c>] (warn_slowpath_fmt) from [<c03146a0>] (phy_stop_machine+0x64/0xcc)
>> [<c03146a0>] (phy_stop_machine) from [<c031a448>] (smsc911x_suspend+0x44/0xa4)
>> [<c031a448>] (smsc911x_suspend) from [<c0307090>] (dpm_run_callback+0x17c/0x3ec)
>> [<c0307090>] (dpm_run_callback) from [<c030a584>] (__device_suspend+0x498/0x6b0)
>> [<c030a584>] (__device_suspend) from [<c030a974>] (dpm_suspend+0x1d8/0x568)
>> [<c030a974>] (dpm_suspend) from [<c00880bc>]
>> (suspend_devices_and_enter+0x78/0xe98)
>> [<c00880bc>] (suspend_devices_and_enter) from [<c008991c>]
>> (pm_suspend+0xa40/0xbec)
>> [<c008991c>] (pm_suspend) from [<c0086e68>] (state_store+0xac/0xcc)
>> [<c0086e68>] (state_store) from [<c01c7780>] (kernfs_fop_write+0x190/0x1d0)
>> [<c01c7780>] (kernfs_fop_write) from [<c0153418>] (__vfs_write+0x20/0x11c)
>> [<c0153418>] (__vfs_write) from [<c0153688>] (vfs_write+0xb8/0x144)
>> [<c0153688>] (vfs_write) from [<c0153808>] (SyS_write+0x40/0x80)
>> [<c0153808>] (SyS_write) from [<c0010200>] (ret_fast_syscall+0x0/0x28)
>> ---[ end trace 8fc4c71351438008 ]---
>> libphy: phy_stop_machine: Kicking state machine synchronously
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1083 at drivers/net/ethernet/smsc/smsc911x.c:988
>> smsc911x_phy_adjust_link+0x2c/0x2e0
>> PHY stopped
>> CPU: 0 PID: 1083 Comm: bash Tainted: G        W
>> 4.14.0-rc7-ape6evm-00443-gcdfc0e18a47e0bb3-dirty #637
>> Hardware name: Generic R8A73A4 (Flattened Device Tree)
>> [<c0018f3c>] (unwind_backtrace) from [<c0014f40>] (show_stack+0x10/0x14)
>> [<c0014f40>] (show_stack) from [<c04bd928>] (dump_stack+0xa4/0xdc)
>> [<c04bd928>] (dump_stack) from [<c002ac28>] (__warn+0xcc/0xfc)
>> [<c002ac28>] (__warn) from [<c002ac8c>] (warn_slowpath_fmt+0x34/0x44)
>> [<c002ac8c>] (warn_slowpath_fmt) from [<c031b1e8>]
>> (smsc911x_phy_adjust_link+0x2c/0x2e0)
>> [<c031b1e8>] (smsc911x_phy_adjust_link) from [<c031378c>]
>> (phy_link_down+0x18/0x24)
>> [<c031378c>] (phy_link_down) from [<c0314518>] (phy_state_machine+0x2d0/0x3f4)
>> [<c0314518>] (phy_state_machine) from [<c03146d8>] (phy_stop_machine+0x9c/0xcc)
>> [<c03146d8>] (phy_stop_machine) from [<c031a448>] (smsc911x_suspend+0x44/0xa4)
>> [<c031a448>] (smsc911x_suspend) from [<c0307090>] (dpm_run_callback+0x17c/0x3ec)
>> [<c0307090>] (dpm_run_callback) from [<c030a584>] (__device_suspend+0x498/0x6b0)
>> [<c030a584>] (__device_suspend) from [<c030a974>] (dpm_suspend+0x1d8/0x568)
>> [<c030a974>] (dpm_suspend) from [<c00880bc>]
>> (suspend_devices_and_enter+0x78/0xe98)
>> [<c00880bc>] (suspend_devices_and_enter) from [<c008991c>]
>> (pm_suspend+0xa40/0xbec)
>> [<c008991c>] (pm_suspend) from [<c0086e68>] (state_store+0xac/0xcc)
>> [<c0086e68>] (state_store) from [<c01c7780>] (kernfs_fop_write+0x190/0x1d0)
>> [<c01c7780>] (kernfs_fop_write) from [<c0153418>] (__vfs_write+0x20/0x11c)
>> [<c0153418>] (__vfs_write) from [<c0153688>] (vfs_write+0xb8/0x144)
>> [<c0153688>] (vfs_write) from [<c0153808>] (SyS_write+0x40/0x80)
>> [<c0153808>] (SyS_write) from [<c0010200>] (ret_fast_syscall+0x0/0x28)
>> ---[ end trace 8fc4c71351438009 ]---
>> libphy: phy_stop_machine: Kicking state machine done
>>
>> If I revert your "net: smsc911x: Properly manage PHY during suspend/resume",
>> phy_stop_machine() is no longer called twice, and system suspend works.
>
> OK, there does appear to be a problem in how the network device vs. mdio
> bus are suspended/resumed in this particular driver, and I have no idea
> why, see below.
>
>>
>> However, during resume, smsc911x_mii_read() is called before the
>> smsc is enabled, cfr. the fourth backtrace:
>
> Humm, how is that possible? smsc911x_mii_bus properly sets its parent
> device to be the platform device of the network device (which is
> correct) so by virtue of child -> parent relationships, we should have
> the network device resume first, and then have the MDIO bus resume
> second (unless I am completely off here).

The MDIO bus callback is not called from the network device ...

>>      WARNING: CPU: 1 PID: 17 at
>> drivers/net/ethernet/smsc/smsc911x.c:165 __smsc911x_reg_read+0x1c/0x88
>>     Modules linked in:
>>     CPU: 1 PID: 17 Comm: kworker/1:0 Tainted: G        W
>> 4.14.0-rc7-kzm9g-00443-gcdfc0e18a47e0bb3-dirty #1013
>>     Hardware name: Generic SH73A0 (Flattened Device Tree)
>>
>> smsc hangs of [fec10000.bus, which is started only here --->
>>
>>     Workqueue: events_power_efficient phy_state_machine
>>     [<c010f304>] (unwind_backtrace) from [<c010b7ec>] (show_stack+0x10/0x14)
>>     [<c010b7ec>] (show_stack) from [<c0551ab0>] (dump_stack+0xa4/0xdc)
>>     [<c0551ab0>] (dump_stack) from [<c0120480>] (__warn+0xcc/0xfc)
>>     [<c0120480>] (__warn) from [<c0120554>] (warn_slowpath_null+0x1c/0x24)
>>     [<c0120554>] (warn_slowpath_null) from [<c03ce264>]
>> (__smsc911x_reg_read+0x1c/0x88)
>>     [<c03ce264>] (__smsc911x_reg_read) from [<c03cefc8>]
>> (smsc911x_mac_read+0x4c/0x118)
>>     [<c03cefc8>] (smsc911x_mac_read) from [<c03cf320>]
>> (smsc911x_mii_read+0x2c/0xa4)
>>     [<c03cf320>] (smsc911x_mii_read) from [<c03cd490>] (mdiobus_read+0x58/0x70)
>>     [<c03cd490>] (mdiobus_read) from [<c03cc4a0>] (genphy_update_link+0x18/0x50)
>>     [<c03cc4a0>] (genphy_update_link) from [<c03cc4e4>]
>> (genphy_read_status+0xc/0x1cc)
>>     [<c03cc4e4>] (genphy_read_status) from [<c03ca3e4>]
>> (phy_state_machine+0xa8/0x3f4)
>>     [<c03ca3e4>] (phy_state_machine) from [<c013a740>]
>> (process_one_work+0x240/0x3fc)
>>     [<c013a740>] (process_one_work) from [<c013af28>]
>> (worker_thread+0x2cc/0x40c)
>>     [<c013af28>] (worker_thread) from [<c014024c>] (kthread+0x124/0x144)

... but from the worker thread, which is unaware of PM states, unless it is
a freezable workqueue.

Cfr. my patch "[1/2] net: phy: Freeze PHY polling before suspending devices"
(https://patchwork.kernel.org/patch/9915901/), which made it a freezable
workqueue, like was done on PCI to solve similar issues with PME scanning.

>>     [<c014024c>] (kthread) from [<c01071c8>] (ret_from_fork+0x14/0x2c)
>>     ---[ end trace 21b7024e273f9f21 ]---
>>     ------------[ cut here ]------------
>>
>> (yes, the fourth backtrace is from another machine, but I can trigger
>> all of this
>>  on both r8a73a4/ape6evm and sh73a0/kzm9g anyway).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds