[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZGzOZm0oJimLHCDA@shell.armlinux.org.uk>
Date: Tue, 23 May 2023 15:32:06 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: David Epping <david.epping@...singlinkelectronics.com>
Cc: Andrew Lunn <andrew@...n.ch>, Vladimir Oltean <olteanv@...il.com>,
Heiner Kallweit <hkallweit1@...il.com>,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
UNGLinuxDriver@...rochip.com
Subject: Re: [PATCH net v2 0/3] net: phy: mscc: support VSC8501
On Tue, May 23, 2023 at 03:32:36PM +0200, David Epping wrote:
> On Tue, May 23, 2023 at 03:16:51PM +0200, Andrew Lunn wrote:
> > > - I left the mutex_lock(&phydev->lock) in the
> > > vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
> > > is required to repeatedly access phydev->interface and
> > > phy_interface_is_rgmii(phydev) in a consistent way.
> >
> > Just adding to Russell comment.
> >
> > As a general rule of thumb, if your driver is doing something which no
> > other driver is doing, you have to consider if it is correct. A PHY
> > driver taking phydev->lock is very unusual. So at minimum you should
> > be able to explain why it is needed. And when it comes to locking,
> > locking is hard, so you really should understand it.
> >
> > Now the mscc is an odd device, because it has multiple PHYs in the
> > package, and a number of registers are shared between these PHYs. So
> > it does have different locking requirements to most PHYs. However, i
> > don't think that is involved here. Those oddities are hidden behind
> > phy_base_write() and phy_base_read().
> >
> > Andrew
>
> Russell, Andrew,
>
> as you stated, locking is hard, and I am not in detail familiar with
> the mscc driver and the supported PHYs behavior. Also, I only have
> VSC8501, the single PHY chip, and none of the multi PHY chips to test.
> And testing these corner cases and race conditions is hard anyways.
> Thus my current patch is not touching the locking code at all, and
> assumes the current mainline code is correct in that regard.
> Because I don't understand all implications, I'm hesitant to change
> the existing locking scheme.
> Maybe this can be a separate patch? In the current patch set I'm not
> making the situation worse (unlike the first one where I added locks
> which Russell pointed out).
> If you insist and all agree the locks should be removed with this
> patch set, I'll update it of course.
Reading through this driver, IMHO it's clear that the original author
didn't have much idea about locking.
Your assumption that taking phydev->lock in vsc85xx_rgmii_set_skews()
protects phydev->interface is provably false, because:
static int vsc8584_config_init(struct phy_device *phydev)
{
...
if (phy_interface_is_rgmii(phydev)) {
ret = vsc85xx_rgmii_set_skews(phydev, VSC8572_RGMII_CNTL,
VSC8572_RGMII_RX_DELAY_MASK,
VSC8572_RGMII_TX_DELAY_MASK);
This accesses phydev->interface without holding phydev->lock,
before entering vsc85xx_rgmii_set_skews().
The second place that vsc85xx_rgmii_set_skews() is called from is
vsc85xx_default_config() which also accesses phydev->interface,
again without taking the phydev->lock.
So both paths into vsc85xx_rgmii_set_skews() have already read
phydev->interface without taking the lock. If this was what the
lock in vsc85xx_rgmii_set_skews() was protecting, then surely it
would need to protect those reads as well! It doesn't.
Also, with knowledge of phylib, I can say that this lock is
completely unnecessary when accessing phydev->interface in any
PHY driver .config_init method, which is the only place that
vsc85xx_rgmii_set_skews() is called from.
Having read the rest of the driver, it would appear that phydev->lock
is being abused to protect register accesses. This is evidenced by
the following, where I also set out why it's wrong:
vsc85xx_led_cntl_set()... which should be using phy_modify(), not
phy_read()..modify..phy_write(), which is open to races e.g. from
userspace MDIO access. phydev->lock doesn't solve anything there.
vsc85xx_edge_rate_cntl_set()... which correctly uses
phy_modify_paged() which itself will correctly prevent racy accesses
by taking the MDIO bus lock. It makes no accesses to anything else,
so phydev->lock here is entirely unnecessary.
vsc85xx_mac_if_set()... which is another case of racy access in the
same way as vsc85xx_led_cntl_set().
vsc8531_pre_init_seq_set() and vsc85xx_eee_init_seq_set()... both of
which IMHO show a complete misunderstanding for locking. At least
both of these functions are safe from other threads accessing the
bus because they correctly use phy_select_page()...phy_restore_page()
(which uses the MDIO bus lock to guarantee no other access will
happen.) BTW, I'm the author of phy_select_page()...phy_restore_page()
which were added to ensure that PHY drivers can _safely_ access
paged registers without fear of anything else disrupting accesses
to those paged registers, not even from userspace.
Essentially, taking phydev->lock offers absolutely zero protection
against another thread making accesses to the PHYs registers. The
*only* lock which prevents concurrent access to registers on devices
on a MDIO bus is the MDIO bus lock.
The only locking decision that I can see in this driver that is
correct is in phy_base_(read|write) which correctly demand that the
MDIO bus lock is held.
Oh my, things get even more "fun"...
vsc8574_config_pre_init() requires the MDIO bus lock to be held when
its called. This function uses request_firmware(), which can call out
to userspace and then *block* waiting for userspace to respond. This
will block *all* access to any device on that MDIO bus until the
firmware request has been satisfied.
Sorry, but the locking in this driver is nothing but a mess.
I'm sorry that you're the one who's modifying the driver when we've
spotted this, but please can you add a patch which first removes
phydev->lock from vsc85xx_rgmii_set_skews() and then your patch on
top please?
At least that starts to reduce the amount of broken locking in this
driver.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Powered by blists - more mailing lists