[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8fe8cb57-7d63-894a-9550-938991e1bf67@gmail.com>
Date: Tue, 19 May 2020 22:37:43 +0200
From: Heiner Kallweit <hkallweit1@...il.com>
To: Val Jurin <valjurin@....net.ru>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew@...n.ch>,
Florian Fainelli <f.fainelli@...il.com>,
Russell King - ARM Linux <linux@...linux.org.uk>
Subject: Re: phydev control race condition while AUTONEG_DISABLE
On 18.05.2020 18:40, Val Jurin wrote:
> Hello everybody.
>
> This letter was sent to you because you are signed in as MAINTAINERS
> for ETHERNET PHY LIBRARY topic.
>
> Issue:
> It seems I found "race condition" situation for PHY devices in AUTONEG_DISABLE mode.
>
> Description:
> I found this issue in my working kernel 4.11.7 but I cloned git master from Linus' git master branch
> and practically the same code is still in master. ( for now ver. 5.7.0-rc5 )
>
> Please look at the part of file
> drivers/net/phy/phy.c
> with my comments:
>
> === code ===
> int phy_ethtool_ksettings_set(struct phy_device *phydev,
> const struct ethtool_link_ksettings *cmd)
> {
>
> /* THE PART OF THE CODE SKIPPED */
>
> phydev->autoneg = autoneg;
>
> phydev->speed = speed;
>
> linkmode_copy(phydev->advertising, advertising);
>
> linkmode_mod_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
> phydev->advertising, autoneg == AUTONEG_ENABLE);
>
> phydev->duplex = duplex;
>
> /* LET'S MARK THIS CODE POINT AS THE "POINT 1" */
>
> phydev->mdix_ctrl = cmd->base.eth_tp_mdix_ctrl;
>
> /* Restart the PHY */
>
> /* AND WE ARE GOING TO phy_start_aneg */
>
> phy_start_aneg(phydev);
>
> return 0;
> }
> EXPORT_SYMBOL(phy_ethtool_ksettings_set);
> === EoF code ===
>
> Now look at the beginning of phy_start_aneg:
>
> === code ===
> int phy_start_aneg(struct phy_device *phydev)
> {
> int err;
>
> if (!phydev->drv)
> return -EIO;
>
> /* LET'S MARK THIS CODE POINT AS THE "POINT 2" */
> /* JUST BEFORE MUTEX LOCK */
> mutex_lock(&phydev->lock);
>
> /* THE FOLLOWING CODE SKIPPED */
> === EoF code ===
>
>
> The statement.
>
> If the short code chain between "POINT 1" and "POINT 2" is running
> and physical device is in AUTONEG_DISABLE mode
> and one of the following asynchronous events occur between these points of code:
>
> a) phydev related status change interrupt (with following hi-prio tasklet);
> b) task rescheduling which may call phydev status POLL routine
> (if phydev is in non-interrupt-driven mode);
>
> the following fields in struct phy_device:
>
> phydev->speed
> phydev->duplex
>
> will be rewritten by phy_device read_status function.
>
> For example - genphy_read_status ( more strict by genphy_read_status_fixed in kernel v.5 )
>
> and the command from ethtool will not set correct speed and duplex which
> stays with old values of speed/duplex.
>
> The probability of this event is vanishingly small and practically may never
> occur because:
> a) most of phy devices work in AUTONEG_ENABLE mode;
> b) the affected code chain is too short. It's just a dozen of CPU commands
> mostly for functions epilogue and prologue and the probability of incoming
> async events for phydev status change is practically close to zero while
> this code chain is running.
>
> Anyway, it is a cause for race condition.
>
> How to emulate:
> Just inject something like
> phydev->drv->config_aneg(phydev)
> in any place of the described code chain between "POINT 1" and "POINT 2"
>
> How it was found :
> I've got a card with MX6UL (NXP ARM SOC) with 2x FEC + micrel 8081 PHY.
> The interrupt line of one of these PHY devices was tied low by adjusters :-E
> and this caused high frequency interrupts + phy read_status call with interrupt rate
> limited by interrupt controller or something else. But it was enough to see
> that ethtool sets speed and duplex one time of three (or more) tries.
> Elsewhere it may never occur and it will never be found :)
>
> Possible solution:
> Everything is normal in AUTONEG_ENABLE mode because the fields
> "speed" and "duplex" are not used in ethtool command by config_aneg
> which uses advertise fields as command data in this case.
>
> It looks like the struct phy_device must have separated fields for command
> and for status change:
>
> phydev->speed
> phydev->duplex
>
> must be left for status
> but, for example
>
> phydev->speed_req
> phydev->duplex_req
>
> must be used for ethtool in phy_ethtool_ksettings_set (*)
> and then used as command data in phy_start_aneg or later
> in appropriate place _after_ phy_start_aneg locks phydev mutex.
>
I see the point. It should be sufficient to add an unlocked version
of phy_start_aneg() and protect the critical section in
phy_ethtool_ksettings_set() with the phydev mutex.
Could you please test with the two patches below?
> (*) as you know this is the default, other drivers may use their own ethtool ksettings
>
> I can not do patch/changes myself because this code is historically very old and
> trusted and very important and can affect not only the functions described above.
> I think you better know what to do or leave it "as is" because the
> probability is too low. But as I suspect and as I was taught ...
> if probability is very very close to zero and it is not still strictly zero
> just find the way what needs to be done to fix it.
> That was the reason to send you this letter.
>
> Thank you for attention.
>
> Regards,
> Valentine Jurin,
> hardware support, kernel and boot maintenance,
> NSG Ltd. , http://www.nsg.ru
>
> P.S Also we want to say respects to you and to all kernel team for your great job.
> Thank you very much.
> We like Linux.
>
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index d4bbf79da..d945be675 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -602,7 +602,7 @@ static int phy_check_link_status(struct phy_device *phydev)
}
/**
- * phy_start_aneg - start auto-negotiation for this PHY device
+ * __phy_start_aneg - start auto-negotiation for this PHY device
* @phydev: the phy_device struct
*
* Description: Sanitizes the settings (if we're not autonegotiating
@@ -610,25 +610,33 @@ static int phy_check_link_status(struct phy_device *phydev)
* If the PHYCONTROL Layer is operating, we change the state to
* reflect the beginning of Auto-negotiation or forcing.
*/
-int phy_start_aneg(struct phy_device *phydev)
+int __phy_start_aneg(struct phy_device *phydev)
{
int err;
if (!phydev->drv)
return -EIO;
- mutex_lock(&phydev->lock);
-
if (AUTONEG_DISABLE == phydev->autoneg)
phy_sanitize_settings(phydev);
err = phy_config_aneg(phydev);
if (err < 0)
- goto out_unlock;
+ return err;
if (phy_is_started(phydev))
err = phy_check_link_status(phydev);
-out_unlock:
+
+ return err;
+}
+EXPORT_SYMBOL(__phy_start_aneg);
+
+int phy_start_aneg(struct phy_device *phydev)
+{
+ int err;
+
+ mutex_lock(&phydev->lock);
+ err = __phy_start_aneg(phydev);
mutex_unlock(&phydev->lock);
return err;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 5d8ff5428..ea9ba6d1c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1244,6 +1244,7 @@ void phy_disconnect(struct phy_device *phydev);
void phy_detach(struct phy_device *phydev);
void phy_start(struct phy_device *phydev);
void phy_stop(struct phy_device *phydev);
+int __phy_start_aneg(struct phy_device *phydev);
int phy_start_aneg(struct phy_device *phydev);
int phy_aneg_done(struct phy_device *phydev);
int phy_speed_down(struct phy_device *phydev, bool sync);
--
2.26.2
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index d945be675..7bf37ec3f 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -291,6 +291,8 @@ int phy_ethtool_ksettings_set(struct phy_device *phydev,
duplex != DUPLEX_FULL)))
return -EINVAL;
+ mutex_lock(&phydev->lock);
+
phydev->autoneg = autoneg;
phydev->speed = speed;
@@ -305,7 +307,9 @@ int phy_ethtool_ksettings_set(struct phy_device *phydev,
phydev->mdix_ctrl = cmd->base.eth_tp_mdix_ctrl;
/* Restart the PHY */
- phy_start_aneg(phydev);
+ __phy_start_aneg(phydev);
+
+ mutex_unlock(&phydev->lock);
return 0;
}
--
2.26.2
Powered by blists - more mailing lists