lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171210164752.GO10595@n2100.armlinux.org.uk>
Date:   Sun, 10 Dec 2017 16:47:52 +0000
From:   Russell King - ARM Linux <linux@...linux.org.uk>
To:     Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>
Cc:     netdev@...r.kernel.org
Subject: [BUG] micrel phy suspend/resume

Guys,

I've just tripped over a bug with the Micrel PHY driver, but it
really isn't specific to the Micrel PHY driver.

When we suspend, we suspend the PHY and then the MAC driver (eg,
on the ZII board):

[  198.822751] 400d0000.ethernet-1:00: bus : mdio_bus_suspend+0x0/0x34
[  198.822859] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
[  198.822878] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3900
...
[  198.826235] 400d0000.ethernet: bus : platform_pm_freeze+0x0/0x5c
[  198.826354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
[  198.826374] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
[  198.826503] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0000
[  198.826699] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000

When we resume, the order is reversed:

[  198.848300] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
[  198.849024] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
[  198.849120] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
[  198.849141] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
[  198.849243] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
[  198.849401] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
[  198.849419] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
[  198.849637] __mdiobus_read: 61 400d0000.ethernet-1 00 01 => 7849
...
[  198.852677] 400d0000.ethernet-1:00: bus : mdio_bus_resume+0x0/0x34
[  198.852778] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
[  198.852797] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100

Now, the MAC driver calls phy_stop() and phy_start() from within its
own suspend/resume methods, and at this is done while the PHY is,
as far as the kernel PM code is concerned, suspended.

However, phylib works around that by resuming the PHY itself when
phy_start() is called in this situation.  That looks good, but it
really isn't in this case.  Given the above sequence, we will be in
PHY_HALTED state.

So, when phy_start() is called, the first thing it does is check the
state, and finds PHY_HALTED.  It then tries to enable the PHY
interrupts.  This cause the Micrel driver to write to the PHY to
enable interrupts - it reads 0x1b to ack any pre-existing interrupt,
modifies 0x1f to set the interrupt pin level, and then supposedly
writes 0x1b to enable interrupts.

However, at this point, the PHY is still powered down, as we can see
in the following read of the BMCR - containing 0x3900.  The write to
enable interrupts is ignored, and so interrupts remain disabled.

The resume process continues, and the system resumes, but interrupts
on the PHY remain disabled, and the phylib state machine never
advances.  You can do anything you like with cables etc, as far as
phylib is concerned, the link steadfastly remains "down".

That's a bit of a problem if your platform is running root-NFS through
that network interface.

It looks like some variants of the Micrel phy code work around this by
including in their resume path code to enable PHY interrupts
independently of phylib.

phy_resume() can't be called inside the locked region in phy_start(),
where we enable interrupts, because genphy_resume() also wants to take
phydev->lock - and it wants to do that to safely read-modify-write the
BMCR register.  This, I feel, comes back to an abuse of the phy state
machine lock to also protect atomic bus operations.

If we move over to using the bus lock to protect bus atomic operations
(as I believe we should) then phy_resume() can be called while holding
phydev->lock from within bits of the code that are protecting themselves
from concurrency with the phylib state machine.  It also means that we
can get rid of some of these boolean variables, and most importantly
in this particular case, call phy_resume() before we enable interrupts.

With that arrangement, things look a lot better:

[   74.545584] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
[   74.546010] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
[   74.546029] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
[   74.546264] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
[   74.546354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 8100
[   74.546373] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 8100
[   74.546651] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500

The patch for this is slightly larger than it needs to be because I've
converted more places that do read-modify-write to the new xxx_modify()
accessor, and it also ensures that phy_resume() is consistently called
with the phy state machine lock held.  It also means we can get rid of
the interrupt enable hack for some micrel PHYs, which I suspect comes
from this same root cause.

Something similar ought to be done for phy_suspend() as well, but that's
a little more complex because it also calls get_wol() functions.

More places could probably be converted to phy_modify() too.

 drivers/net/phy/at803x.c     | 24 +++--------------
 drivers/net/phy/mdio_bus.c   | 32 ++++++++++++++++++++++
 drivers/net/phy/micrel.c     | 15 +----------
 drivers/net/phy/phy.c        |  9 +++----
 drivers/net/phy/phy_device.c | 63 +++++++++++++-------------------------------
 include/linux/mdio.h         |  1 +
 include/linux/phy.h          | 18 +++++++++++++
 7 files changed, 77 insertions(+), 85 deletions(-)

diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index 4c75cfcdeaec..a20d1f5e4dbf 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -258,38 +258,22 @@ static int at803x_suspend(struct phy_device *phydev)
 	int value;
 	int wol_enabled;
 
-	mutex_lock(&phydev->lock);
-
 	value = phy_read(phydev, AT803X_INTR_ENABLE);
 	wol_enabled = value & AT803X_INTR_ENABLE_WOL;
 
-	value = phy_read(phydev, MII_BMCR);
-
 	if (wol_enabled)
-		value |= BMCR_ISOLATE;
+		value = BMCR_ISOLATE;
 	else
-		value |= BMCR_PDOWN;
-
-	phy_write(phydev, MII_BMCR, value);
+		value = BMCR_PDOWN;
 
-	mutex_unlock(&phydev->lock);
+	phy_modify(phydev, MII_BMCR, value, value);
 
 	return 0;
 }
 
 static int at803x_resume(struct phy_device *phydev)
 {
-	int value;
-
-	mutex_lock(&phydev->lock);
-
-	value = phy_read(phydev, MII_BMCR);
-	value &= ~(BMCR_PDOWN | BMCR_ISOLATE);
-	phy_write(phydev, MII_BMCR, value);
-
-	mutex_unlock(&phydev->lock);
-
-	return 0;
+	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN | BMCR_ISOLATE, 0);
 }
 
 static int at803x_probe(struct phy_device *phydev)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 1bf3adcdcbac..d5adb00eafdf 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -651,6 +651,38 @@ int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val)
 EXPORT_SYMBOL(mdiobus_write);
 
 /**
+ * mdiobus_modify - Convenience function for writing a given MII mgmt register
+ * @bus: the mii_bus struct
+ * @addr: the phy address
+ * @regnum: register number to write
+ * @mask: bits to mask off from the register @regnum
+ * @val: new value of bits set in mask to write to @regnum
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+int mdiobus_modify(struct mii_bus *bus, int addr, u32 regnum, u16 mask, u16 val)
+{
+	int retval;
+	int err;
+
+	BUG_ON(in_interrupt());
+
+	mutex_lock(&bus->mdio_lock);
+	err = retval = __mdiobus_read(bus, addr, regnum);
+	if (err >= 0) {
+		retval &= ~mask;
+		retval |= val & mask;
+		err = __mdiobus_write(bus, addr, regnum, retval);
+	}
+	mutex_unlock(&bus->mdio_lock);
+
+	return err;
+}
+EXPORT_SYMBOL(mdiobus_modify);
+
+/**
  * mdio_bus_match - determine if given MDIO driver supports the given
  *		    MDIO device
  * @dev: target MDIO device
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index fdb43dd9b5cd..9e9438caa2b7 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -711,22 +711,9 @@ static int kszphy_suspend(struct phy_device *phydev)
 
 static int kszphy_resume(struct phy_device *phydev)
 {
-	int ret;
-
 	genphy_resume(phydev);
 
-	ret = kszphy_config_reset(phydev);
-	if (ret)
-		return ret;
-
-	/* Enable PHY Interrupts */
-	if (phy_interrupt_is_valid(phydev)) {
-		phydev->interrupts = PHY_INTERRUPT_ENABLED;
-		if (phydev->drv->config_intr)
-			phydev->drv->config_intr(phydev);
-	}
-
-	return 0;
+	return kszphy_config_reset(phydev);
 }
 
 static int kszphy_probe(struct phy_device *phydev)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index f2a83ab00e71..874740957606 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -844,7 +844,6 @@ EXPORT_SYMBOL(phy_stop);
  */
 void phy_start(struct phy_device *phydev)
 {
-	bool do_resume = false;
 	int err = 0;
 
 	mutex_lock(&phydev->lock);
@@ -857,6 +856,9 @@ void phy_start(struct phy_device *phydev)
 		phydev->state = PHY_UP;
 		break;
 	case PHY_HALTED:
+		/* if phy was suspended, bring the physical link up again */
+		phy_resume(phydev);
+
 		/* make sure interrupts are re-enabled for the PHY */
 		if (phydev->irq != PHY_POLL) {
 			err = phy_enable_interrupts(phydev);
@@ -865,17 +867,12 @@ void phy_start(struct phy_device *phydev)
 		}
 
 		phydev->state = PHY_RESUMING;
-		do_resume = true;
 		break;
 	default:
 		break;
 	}
 	mutex_unlock(&phydev->lock);
 
-	/* if phy was suspended, bring the physical link up again */
-	if (do_resume)
-		phy_resume(phydev);
-
 	phy_trigger_machine(phydev, true);
 }
 EXPORT_SYMBOL(phy_start);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 67f25ac29025..d0d98e6f6406 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -135,7 +135,9 @@ static int mdio_bus_phy_resume(struct device *dev)
 	if (!mdio_bus_phy_may_suspend(phydev))
 		goto no_resume;
 
+	mutex_lock(&phydev->lock);
 	ret = phy_resume(phydev);
+	mutex_unlock(&phydev->lock);
 	if (ret < 0)
 		return ret;
 
@@ -1026,7 +1028,9 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	if (err)
 		goto error;
 
+	mutex_lock(&phydev->lock);
 	phy_resume(phydev);
+	mutex_unlock(&phydev->lock);
 	phy_led_triggers_register(phydev);
 
 	return err;
@@ -1157,6 +1161,8 @@ int phy_resume(struct phy_device *phydev)
 	struct phy_driver *phydrv = to_phy_driver(phydev->mdio.dev.driver);
 	int ret = 0;
 
+	WARN_ON(!mutex_is_locked(&phydev->lock));
+
 	if (phydev->drv && phydrv->resume)
 		ret = phydrv->resume(phydev);
 
@@ -1322,9 +1328,8 @@ static int genphy_config_eee_advert(struct phy_device *phydev)
  */
 int genphy_setup_forced(struct phy_device *phydev)
 {
-	int ctl = phy_read(phydev, MII_BMCR);
+	u16 ctl = 0;
 
-	ctl &= BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN;
 	phydev->pause = 0;
 	phydev->asym_pause = 0;
 
@@ -1336,7 +1341,10 @@ int genphy_setup_forced(struct phy_device *phydev)
 	if (DUPLEX_FULL == phydev->duplex)
 		ctl |= BMCR_FULLDPLX;
 
-	return phy_write(phydev, MII_BMCR, ctl);
+	return phy_modify(phydev, MII_BMCR,
+			  BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN |
+			  BMCR_SPEED1000 | BMCR_SPEED100 | BMCR_FULLDPLX,
+			  ctl);
 }
 EXPORT_SYMBOL(genphy_setup_forced);
 
@@ -1346,17 +1354,10 @@ EXPORT_SYMBOL(genphy_setup_forced);
  */
 int genphy_restart_aneg(struct phy_device *phydev)
 {
-	int ctl = phy_read(phydev, MII_BMCR);
-
-	if (ctl < 0)
-		return ctl;
-
-	ctl |= BMCR_ANENABLE | BMCR_ANRESTART;
-
 	/* Don't isolate the PHY if we're negotiating */
-	ctl &= ~BMCR_ISOLATE;
-
-	return phy_write(phydev, MII_BMCR, ctl);
+	return phy_modify(phydev, MII_BMCR,
+			  BMCR_ANENABLE | BMCR_ANRESTART | BMCR_ISOLATE,
+			  BMCR_ANENABLE | BMCR_ANRESTART);
 }
 EXPORT_SYMBOL(genphy_restart_aneg);
 
@@ -1622,48 +1623,20 @@ EXPORT_SYMBOL(genphy_config_init);
 
 int genphy_suspend(struct phy_device *phydev)
 {
-	int value;
-
-	mutex_lock(&phydev->lock);
-
-	value = phy_read(phydev, MII_BMCR);
-	phy_write(phydev, MII_BMCR, value | BMCR_PDOWN);
-
-	mutex_unlock(&phydev->lock);
-
-	return 0;
+	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN, BMCR_PDOWN);
 }
 EXPORT_SYMBOL(genphy_suspend);
 
 int genphy_resume(struct phy_device *phydev)
 {
-	int value;
-
-	mutex_lock(&phydev->lock);
-
-	value = phy_read(phydev, MII_BMCR);
-	phy_write(phydev, MII_BMCR, value & ~BMCR_PDOWN);
-
-	mutex_unlock(&phydev->lock);
-
-	return 0;
+	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN, 0);
 }
 EXPORT_SYMBOL(genphy_resume);
 
 int genphy_loopback(struct phy_device *phydev, bool enable)
 {
-	int value;
-
-	value = phy_read(phydev, MII_BMCR);
-	if (value < 0)
-		return value;
-
-	if (enable)
-		value |= BMCR_LOOPBACK;
-	else
-		value &= ~BMCR_LOOPBACK;
-
-	return phy_write(phydev, MII_BMCR, value);
+	return phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK,
+			  enable ? BMCR_LOOPBACK : 0);
 }
 EXPORT_SYMBOL(genphy_loopback);
 
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index 4be30adc033b..8e69211685be 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -264,6 +264,7 @@ int mdiobus_read(struct mii_bus *bus, int addr, u32 regnum);
 int mdiobus_read_nested(struct mii_bus *bus, int addr, u32 regnum);
 int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val);
 int mdiobus_write_nested(struct mii_bus *bus, int addr, u32 regnum, u16 val);
+int mdiobus_modify(struct mii_bus *bus, int addr, u32 regnum, u16 mask, u16 val);
 
 int mdiobus_register_device(struct mdio_device *mdiodev);
 int mdiobus_unregister_device(struct mdio_device *mdiodev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 03a259f128f9..afe5d963a98c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -766,6 +766,24 @@ static inline int __phy_write(struct phy_device *phydev, u32 regnum, u16 val)
 }
 
 /**
+ * phy_modify - Convenience function for modifying a given PHY register
+ * @phydev: the phy_device struct
+ * @regnum: register number to write
+ * @mask: bits to mask off from the register @regnum
+ * @val: new value of bits set in mask to write to @regnum
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+static inline int phy_modify(struct phy_device *phydev, u32 regnum, u16 mask,
+			     u16 val)
+{
+	return mdiobus_modify(phydev->mdio.bus, phydev->mdio.addr, regnum,
+			      mask, val);
+}
+
+/**
  * phy_interrupt_is_valid - Convenience function for testing a given PHY irq
  * @phydev: the phy_device struct
  *

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ