lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 12 Aug 2022 09:32:15 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Marek Szyprowski <m.szyprowski@...sung.com>,
        netdev@...r.kernel.org,
        Steve Glendinning <steve.glendinning@...well.net>
Cc:     opendmb@...il.com, Andrew Lunn <andrew@...n.ch>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Russell King <linux@...linux.org.uk>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net] net: phy: Warn about incorrect mdio_bus_phy_resume()
 state

On 8/12/22 04:19, Marek Szyprowski wrote:
> Hi All,
> 
> On 02.08.2022 01:34, Florian Fainelli wrote:
>> Calling mdio_bus_phy_resume() with neither the PHY state machine set to
>> PHY_HALTED nor phydev->mac_managed_pm set to true is a good indication
>> that we can produce a race condition looking like this:
>>
>> CPU0						CPU1
>> bcmgenet_resume
>>    -> phy_resume
>>      -> phy_init_hw
>>    -> phy_start
>>      -> phy_resume
>>                                                   phy_start_aneg()
>> mdio_bus_phy_resume
>>    -> phy_resume
>>       -> phy_write(..., BMCR_RESET)
>>        -> usleep()                                  -> phy_read()
>>
>> with the phy_resume() function triggering a PHY behavior that might have
>> to be worked around with (see bf8bfc4336f7 ("net: phy: broadcom: Fix
>> brcm_fet_config_init()") for instance) that ultimately leads to an error
>> reading from the PHY.
>>
>> Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
>> Signed-off-by: Florian Fainelli <f.fainelli@...il.com>
> 
> This patch, as probably intended, triggers a warning during system
> suspend/resume cycle in the SMSC911x driver. I've observed it on ARM
> Juno R1 board on the kernel compiled from next-202208010:
> 
>    ------------[ cut here ]------------
>    WARNING: CPU: 1 PID: 398 at drivers/net/phy/phy_device.c:323
> mdio_bus_phy_resume+0x34/0xc8
>    Modules linked in: smsc911x cpufreq_powersave cpufreq_conservative
> crct10dif_ce ip_tables x_tables ipv6 [last unloaded: smsc911x]
>    CPU: 1 PID: 398 Comm: rtcwake Not tainted 5.19.0+ #940
>    Hardware name: ARM Juno development board (r1) (DT)
>    pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>    pc : mdio_bus_phy_resume+0x34/0xc8
>    lr : dpm_run_callback+0x74/0x350
>    ...
>    Call trace:
>     mdio_bus_phy_resume+0x34/0xc8
>     dpm_run_callback+0x74/0x350
>     device_resume+0xb8/0x258
>     dpm_resume+0x120/0x4a8
>     dpm_resume_end+0x14/0x28
>     suspend_devices_and_enter+0x164/0xa60
>     pm_suspend+0x25c/0x3a8
>     state_store+0x84/0x108
>     kobj_attr_store+0x14/0x28
>     sysfs_kf_write+0x60/0x70
>     kernfs_fop_write_iter+0x124/0x1a8
>     new_sync_write+0xd0/0x190
>     vfs_write+0x208/0x478
>     ksys_write+0x64/0xf0
>     __arm64_sys_write+0x14/0x20
>     invoke_syscall+0x40/0xf8
>     el0_svc_common.constprop.3+0x8c/0x120
>     do_el0_svc+0x28/0xc8
>     el0_svc+0x48/0xd0
>     el0t_64_sync_handler+0x94/0xb8
>     el0t_64_sync+0x15c/0x160
>    irq event stamp: 24406
>    hardirqs last  enabled at (24405): [<ffff8000090c4734>]
> _raw_spin_unlock_irqrestore+0x8c/0x90
>    hardirqs last disabled at (24406): [<ffff8000090b3164>] el1_dbg+0x24/0x88
>    softirqs last  enabled at (24144): [<ffff800008010488>] _stext+0x488/0x5cc
>    softirqs last disabled at (24139): [<ffff80000809bf98>]
> irq_exit_rcu+0x168/0x1a8
>    ---[ end trace 0000000000000000 ]---
> 
> I hope the above information will help fixing the driver.

Yes this is catching an actual issue in the driver in that the PHY state 
machine is still running while the system is trying to suspend. We could 
go about fixing it in a different number of ways, though I believe this 
one is probably correct enough to work and fix the warning:

diff --git a/drivers/net/ethernet/smsc/smsc911x.c 
b/drivers/net/ethernet/smsc/smsc911x.c
index 3bf20211cceb..e9c0668a4dc0 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -1037,6 +1037,8 @@ static int smsc911x_mii_probe(struct net_device *dev)
                 return ret;
         }

+       /* Indicate that the MAC is responsible for managing PHY PM */
+       phydev->mac_managed_pm = true;
         phy_attached_info(phydev);

         phy_set_max_speed(phydev, SPEED_100);
@@ -2587,6 +2589,8 @@ static int smsc911x_suspend(struct device *dev)
         if (netif_running(ndev)) {
                 netif_stop_queue(ndev);
                 netif_device_detach(ndev);
+               if (!device_may_wakeup(dev))
+                       phy_suspend(dev->phydev);
         }

         /* enable wake on LAN, energy detection and the external PME
@@ -2628,6 +2632,8 @@ static int smsc911x_resume(struct device *dev)
         if (netif_running(ndev)) {
                 netif_device_attach(ndev);
                 netif_start_queue(ndev);
+               if (!device_may_wakeup(dev))
+                       phy_resume(dev->phydev);
         }

         return 0;

-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ