netdev - Re: [BUG] micrel phy suspend/resume

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <389b78f6-951c-528d-8b2b-fc6c6adfc77f@gmail.com>
Date:   Sun, 10 Dec 2017 15:50:28 -0800
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Russell King - ARM Linux <linux@...linux.org.uk>,
        Andrew Lunn <andrew@...n.ch>
Cc:     netdev@...r.kernel.org
Subject: Re: [BUG] micrel phy suspend/resume

Hi Russell,

On 12/10/2017 08:47 AM, Russell King - ARM Linux wrote:
> Guys,
> 
> I've just tripped over a bug with the Micrel PHY driver, but it
> really isn't specific to the Micrel PHY driver.
> 
> When we suspend, we suspend the PHY and then the MAC driver (eg,
> on the ZII board):
> 
> [  198.822751] 400d0000.ethernet-1:00: bus : mdio_bus_suspend+0x0/0x34
> [  198.822859] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> [  198.822878] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3900
> ...
> [  198.826235] 400d0000.ethernet: bus : platform_pm_freeze+0x0/0x5c
> [  198.826354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> [  198.826374] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> [  198.826503] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0000
> [  198.826699] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> 
> When we resume, the order is reversed:
> 
> [  198.848300] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> [  198.849024] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> [  198.849120] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> [  198.849141] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> [  198.849243] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> [  198.849401] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> [  198.849419] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> [  198.849637] __mdiobus_read: 61 400d0000.ethernet-1 00 01 => 7849
> ...
> [  198.852677] 400d0000.ethernet-1:00: bus : mdio_bus_resume+0x0/0x34
> [  198.852778] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> [  198.852797] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> 
> Now, the MAC driver calls phy_stop() and phy_start() from within its
> own suspend/resume methods, and at this is done while the PHY is,
> as far as the kernel PM code is concerned, suspended.
> 
> However, phylib works around that by resuming the PHY itself when
> phy_start() is called in this situation.  That looks good, but it
> really isn't in this case.  Given the above sequence, we will be in
> PHY_HALTED state.
> 
> So, when phy_start() is called, the first thing it does is check the
> state, and finds PHY_HALTED.  It then tries to enable the PHY
> interrupts.  This cause the Micrel driver to write to the PHY to
> enable interrupts - it reads 0x1b to ack any pre-existing interrupt,
> modifies 0x1f to set the interrupt pin level, and then supposedly
> writes 0x1b to enable interrupts.
> 
> However, at this point, the PHY is still powered down, as we can see
> in the following read of the BMCR - containing 0x3900.  The write to
> enable interrupts is ignored, and so interrupts remain disabled.
> 
> The resume process continues, and the system resumes, but interrupts
> on the PHY remain disabled, and the phylib state machine never
> advances.  You can do anything you like with cables etc, as far as
> phylib is concerned, the link steadfastly remains "down".
> 
> That's a bit of a problem if your platform is running root-NFS through
> that network interface.
> 
> It looks like some variants of the Micrel phy code work around this by
> including in their resume path code to enable PHY interrupts
> independently of phylib.
> 
> phy_resume() can't be called inside the locked region in phy_start(),
> where we enable interrupts, because genphy_resume() also wants to take
> phydev->lock - and it wants to do that to safely read-modify-write the
> BMCR register.  This, I feel, comes back to an abuse of the phy state
> machine lock to also protect atomic bus operations.
> 
> If we move over to using the bus lock to protect bus atomic operations
> (as I believe we should) then phy_resume() can be called while holding
> phydev->lock from within bits of the code that are protecting themselves
> from concurrency with the phylib state machine.  It also means that we
> can get rid of some of these boolean variables, and most importantly
> in this particular case, call phy_resume() before we enable interrupts.
> 
> With that arrangement, things look a lot better:
> 
> [   74.545584] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> [   74.546010] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> [   74.546029] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> [   74.546264] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> [   74.546354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 8100
> [   74.546373] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 8100
> [   74.546651] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> 
> The patch for this is slightly larger than it needs to be because I've
> converted more places that do read-modify-write to the new xxx_modify()
> accessor, and it also ensures that phy_resume() is consistently called
> with the phy state machine lock held.  It also means we can get rid of
> the interrupt enable hack for some micrel PHYs, which I suspect comes
> from this same root cause.

Since this is a bug fix, I would really rather have the ability to
target the "net" tree with a simpler fix that is contained within
drivers/net/phy/phy.c and which does not require the use of phy_modify().

Thanks for doing all the detective work!
-- 
Florian