netdev - Re: handling of phy_stop() and phy_stop

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180207211306.GL9418@n2100.armlinux.org.uk>
Date:   Wed, 7 Feb 2018 21:13:07 +0000
From:   Russell King - ARM Linux <linux@...linux.org.uk>
To:     Heiner Kallweit <hkallweit1@...il.com>
Cc:     Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: handling of phy_stop() and phy_stop_machine() in phylib

On Wed, Feb 07, 2018 at 09:56:37PM +0100, Heiner Kallweit wrote:
> Am 04.02.2018 um 03:48 schrieb Florian Fainelli:
> > 
> > 
> > On 02/03/2018 03:58 PM, Heiner Kallweit wrote:
> >> Am 03.02.2018 um 21:17 schrieb Andrew Lunn:
> >>> On Sat, Feb 03, 2018 at 05:41:54PM +0100, Heiner Kallweit wrote:
> >>>> This commit forces callers of phy_resume() and phy_suspend() to hold
> >>>> mutex phydev->lock. This was done for calls to phy_resume() and
> >>>> phy_suspend() in phylib, however there are more callers in network
> >>>> drivers. I'd assume that these other calls issue a warning now
> >>>> because of the lock not being held.
> >>>> So is there something I miss or would this have to be fixed?
> >>>
> >>> Hi Heiner
> >>>
> >>> This is a good point.
> >>>
> >>> Yes, it looks like some fixes are needed. But what exactly?
> >>>
> >>> The phy state machine will suspend and resume the phy is you call
> >>> phy_stop() and phy_start() in the MAC suspend and resume functions.
> >>>
> >> AFAICS phy_stop() doesn't suspend the PHY. It just sets the state
> >> to PHY_HALTED and (at least if we're not in polling mode) doesn't
> >> call phy_suspend(). Maybe a call to phy_trigger_machine() is
> >> needed like in phy_start() ? Then the state machine would call
> >> phy_suspend(), provided the link is still up.
> > 
> > Right, phy_stop() merely just moves the state machine to PHY_HALTED and
> > this is actually a great source of problems which I tried to address here:
> > 
> > https://www.mail-archive.com/netdev@vger.kernel.org/msg196061.html
> > 
> > because phy_stop() is not a synchronous call, so when it returns the
> > state machine might still be running (it can take up to a 1 HZ depending
> > on when you called phy_stop()) and so if you took that as a
> > synchronization point to e.g: turn of your Ethernet MAC/MDIO bus clocks,
> > you will likely see problems. phy_stop_machine() would provide that
> > synchronization point, but is not currently exported, despite being a
> > global symbol. This patch series above is all well and good, except that
> > Geert reported issues with suspend/resume interactions which I have not
> > been able to track down.
> > 
> To not confuse readers I changed the subject of the mail to reflect that
> the discussion isn't about the original topic any longer.
> 
> It seems to me that (at least one) reason for the issues is that pm
> callbacks for the phy device and the network device interfere.
> phy device pm (mdio_bus_phy_suspend et al) feels responsible for dealing
> with suspending/resuming the PHY, and the network driver pm callbacks
> as well.
> 
> Maybe, if the network driver takes care, it should inform phy device pm
> to do nothing. For this we could add a flag to phydev.mdio.flags.
> If the network driver sets this flag then mdio_bus_phy_suspend()
> and mdio_bus_phy_resume() could turn into no-ops.
> Not totally sure yet about mdio_bus_phy_restore() ..

What if the MDIO bus is handled by a separate device and the MDIO bus
is suspended prior to the network driver, thereby making the PHY
inaccessible?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up