linux-kernel - Re: [RFC net] Revert "net: phy: Fix race condition on link status change"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ksgfr3o6nm4k5qhamsqhyardnpb5fm5xclhr4fwpgldhzjlax6@r5ry4ztkqx5c>
Date:   Wed, 16 Aug 2023 23:03:36 +0300
From:   Serge Semin <fancer.lancer@...il.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Heiner Kallweit <hkallweit1@...il.com>,
        Russell King <linux@...linux.org.uk>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Francesco Dolcini <francesco.dolcini@...adex.com>,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC net] Revert "net: phy: Fix race condition on link status
 change"

On Wed, Aug 16, 2023 at 09:35:57PM +0200, Andrew Lunn wrote:
> On Wed, Aug 16, 2023 at 09:09:40PM +0300, Serge Semin wrote:
> > Protecting the phy_driver.drv->handle_interrupt() callback invocation by
> > the phy_device.lock mutex causes all the IRQ-capable PHY drivers to lock
> > the mutex twice thus deadlocking on the next calls thread:
> > IRQ: phy_interrupt()
> >      +-> mutex_lock(&phydev->lock); <-------------+
> >          drv->handle_interrupt()                  | Deadlock due to the
> >          +-> phy_error()                          + nested PHY-device
> >              +-> phy_process_error()              | mutex lock
> >                  +-> mutex_lock(&phydev->lock); <-+
> >                      phydev->state = PHY_ERROR;
> >                      mutex_unlock(&phydev->lock);
> >          mutex_unlock(&phydev->lock);
> > 
> > The problem can be easily reproduced just by calling phy_error() from the
> > any PHY-device interrupt handler.
> 
> https://elixir.bootlin.com/linux/v6.5-rc6/source/drivers/net/phy/phy.c#L1201
> 
> /**
>  * phy_error - enter ERROR state for this PHY device
>  * @phydev: target phy_device struct
>  *
>  * Moves the PHY to the ERROR state in response to a read
>  * or write error, and tells the controller the link is down.
>  * Must not be called from interrupt context, or while the
>  * phydev->lock is held.
>  */
> void phy_error(struct phy_device *phydev)
> {
> 	WARN_ON(1);
> 	phy_process_error(phydev);
> }
> EXPORT_SYMBOL(phy_error);
> 
> It is clearly documented you should not do this.
> 
> [Goes and looks]
> 
> Ah, there are lots of examples of
> 
> micrel.c-	irq_status = phy_read(phydev, LAN8814_INTS);
> micrel.c-	if (irq_status < 0) {
> micrel.c:		phy_error(phydev);
> micrel.c-		return IRQ_NONE;
> micrel.c-	}

It's literally the only pattern where the phy_error() method is utilized but
it can be found in all IRQ-capable network PHY drivers. Seeing how
widespread the function usage is I didn't even dared to think that
the function doc could state it can't be utilized like that or when the
lock is held.

> 
> I actually think phy_error() is broken here. The general pattern is
> that the mutex is locked before calling into the driver. So we
> actually want phy_error() to be safe to use with the lock already
> taken. The exceptions when the lock is not held is stuff outside of
> PHY operation, like HWMON, and suspend and resume, plus probe.
> 
> So i suggest you change phy_process_error() to remove the lock.

This doable.

> Maybe
> add a test to ensure the lock is actually held, and do a phydev_err()
> if not.

This can't be done since phy_state_machine() calls phy_error_precise()
which calls phy_process_error() with no phy_device.lock held. Printing the
error in that case would mean an error in the Networking PHY subsystem
itself.

Do you suggest to take the lock before calling phy_error_precise() then?

> 
> The comment about interrupt context is also probably bogus. phylib
> only uses threaded interrupts, and it is safe to block in this
> context.

Ok.

-Serge(y)

> 
> 	Andrew