lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 12 Aug 2022 13:19:47 +0200
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     Florian Fainelli <f.fainelli@...il.com>, netdev@...r.kernel.org,
        Steve Glendinning <steve.glendinning@...well.net>
Cc:     opendmb@...il.com, Andrew Lunn <andrew@...n.ch>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Russell King <linux@...linux.org.uk>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net] net: phy: Warn about incorrect
 mdio_bus_phy_resume() state

Hi All,

On 02.08.2022 01:34, Florian Fainelli wrote:
> Calling mdio_bus_phy_resume() with neither the PHY state machine set to
> PHY_HALTED nor phydev->mac_managed_pm set to true is a good indication
> that we can produce a race condition looking like this:
>
> CPU0						CPU1
> bcmgenet_resume
>   -> phy_resume
>     -> phy_init_hw
>   -> phy_start
>     -> phy_resume
>                                                  phy_start_aneg()
> mdio_bus_phy_resume
>   -> phy_resume
>      -> phy_write(..., BMCR_RESET)
>       -> usleep()                                  -> phy_read()
>
> with the phy_resume() function triggering a PHY behavior that might have
> to be worked around with (see bf8bfc4336f7 ("net: phy: broadcom: Fix
> brcm_fet_config_init()") for instance) that ultimately leads to an error
> reading from the PHY.
>
> Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
> Signed-off-by: Florian Fainelli <f.fainelli@...il.com>

This patch, as probably intended, triggers a warning during system 
suspend/resume cycle in the SMSC911x driver. I've observed it on ARM 
Juno R1 board on the kernel compiled from next-202208010:

  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 398 at drivers/net/phy/phy_device.c:323 
mdio_bus_phy_resume+0x34/0xc8
  Modules linked in: smsc911x cpufreq_powersave cpufreq_conservative 
crct10dif_ce ip_tables x_tables ipv6 [last unloaded: smsc911x]
  CPU: 1 PID: 398 Comm: rtcwake Not tainted 5.19.0+ #940
  Hardware name: ARM Juno development board (r1) (DT)
  pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : mdio_bus_phy_resume+0x34/0xc8
  lr : dpm_run_callback+0x74/0x350
  ...
  Call trace:
   mdio_bus_phy_resume+0x34/0xc8
   dpm_run_callback+0x74/0x350
   device_resume+0xb8/0x258
   dpm_resume+0x120/0x4a8
   dpm_resume_end+0x14/0x28
   suspend_devices_and_enter+0x164/0xa60
   pm_suspend+0x25c/0x3a8
   state_store+0x84/0x108
   kobj_attr_store+0x14/0x28
   sysfs_kf_write+0x60/0x70
   kernfs_fop_write_iter+0x124/0x1a8
   new_sync_write+0xd0/0x190
   vfs_write+0x208/0x478
   ksys_write+0x64/0xf0
   __arm64_sys_write+0x14/0x20
   invoke_syscall+0x40/0xf8
   el0_svc_common.constprop.3+0x8c/0x120
   do_el0_svc+0x28/0xc8
   el0_svc+0x48/0xd0
   el0t_64_sync_handler+0x94/0xb8
   el0t_64_sync+0x15c/0x160
  irq event stamp: 24406
  hardirqs last  enabled at (24405): [<ffff8000090c4734>] 
_raw_spin_unlock_irqrestore+0x8c/0x90
  hardirqs last disabled at (24406): [<ffff8000090b3164>] el1_dbg+0x24/0x88
  softirqs last  enabled at (24144): [<ffff800008010488>] _stext+0x488/0x5cc
  softirqs last disabled at (24139): [<ffff80000809bf98>] 
irq_exit_rcu+0x168/0x1a8
  ---[ end trace 0000000000000000 ]---

I hope the above information will help fixing the driver.

> ---
>   drivers/net/phy/phy_device.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 46acddd865a7..608de5a94165 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -316,6 +316,12 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
>   
>   	phydev->suspended_by_mdio_bus = 0;
>   
> +	/* If we managed to get here with the PHY state machine in a state other
> +	 * than PHY_HALTED this is an indication that something went wrong and
> +	 * we should most likely be using MAC managed PM and we are not.
> +	 */
> +	WARN_ON(phydev->state != PHY_HALTED && !phydev->mac_managed_pm);
> +
>   	ret = phy_init_hw(phydev);
>   	if (ret < 0)
>   		return ret;

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ