netdev - Re: [PATCH net] net: phy: Fix deadlocking in phy

Message-ID: <mk5yter5d6pvdyahfhfruszwp54immvfb3bb7a7chofyhauksb@7vkgyxevt2yv>
Date: Fri, 18 Aug 2023 17:27:22 +0300
From: Serge Semin <fancer.lancer@...il.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: Heiner Kallweit <hkallweit1@...il.com>, 
	Russell King <linux@...linux.org.uk>, "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Francesco Dolcini <francesco.dolcini@...adex.com>, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] net: phy: Fix deadlocking in phy_error() invocation

On Fri, Aug 18, 2023 at 03:07:49PM +0200, Andrew Lunn wrote:
> On Fri, Aug 18, 2023 at 03:54:45PM +0300, Serge Semin wrote:
> >  static void phy_process_error(struct phy_device *phydev)
> >  {
> > -	mutex_lock(&phydev->lock);
> > +	/* phydev->lock must be held for the state change to be safe */
> > +	if (!mutex_is_locked(&phydev->lock))
> > +		phydev_err(phydev, "PHY-device data unsafe context\n");
> > +
> >  	phydev->state = PHY_ERROR;
> > -	mutex_unlock(&phydev->lock);
> >  
> >  	phy_trigger_machine(phydev);
> >  }
> 
> Thanks for the patch, Serge. It looks like a good implementation of
> what I suggested. But thinking about it further, if the error ever
> appears in somebody's kernel log, there is probably not enough
> information to actually fix it. There is no call path. So I think it
> should actually use WARN_ON_ONCE() so we get a stack trace.

A trace is already printed by means of WARN()/WARN_ON() in the callers
of phy_process_error(), namely phy_error_precise() and phy_error().
Wouldn't it be too much to print it twice in a row?

We could redefine the phy_error_precise(), phy_process_error() and
phy_error() functions to something like this:

static void phy_process_error(struct phy_device *phydev,
			      const void *func, int err)
{
	if (__ONCE_LITE_IF(!mutex_is_locked(&phydev->lock)))
		WARN(1, "PHY-device data unsafe context\n");
	else if (func)
		WARN(1, "%pS: returned: %d\n", func, err);
	else
		WARN_ON(1);

	phydev->state = PHY_ERROR;

	phy_trigger_machine(phydev);
}

static void phy_error_precise(struct phy_device *phydev,
			      const void *func, int err)
{
	mutex_lock(&phydev->lock);
	phy_process_error(phydev, func, err);
	mutex_unlock(&phydev->lock);
}

void phy_error(struct phy_device *phydev)
{
	phy_process_error(phydev, NULL, 0);
}
EXPORT_SYMBOL(phy_error);

Though in such an implementation phy_error_precise() looks redundant.
We can freely move its body into its single user, the
phy_state_machine() function.
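
A minimal userspace sketch of that folding, under loud assumptions: the
mutex is modelled as a plain flag, the warnings are counted instead of
dumping a stack, the once-only behaviour of __ONCE_LITE_IF() and the
phy_trigger_machine() call are left out, and the names model_mutex,
fake_read_status() and warn_count are hypothetical. The real
phy_state_machine() takes a work_struct and does far more than this.

```c
/* Userspace model only; not kernel code. */
#include <stdbool.h>
#include <stdio.h>

enum phy_state { PHY_RUNNING, PHY_ERROR };

struct model_mutex { bool locked; };	/* stand-in for struct mutex */

struct phy_device {
	struct model_mutex lock;
	enum phy_state state;
};

static int warn_count;	/* number of warnings emitted so far */

static void model_lock(struct model_mutex *m)   { m->locked = true; }
static void model_unlock(struct model_mutex *m) { m->locked = false; }

/* The caller must hold phydev->lock; this only complains if it does not. */
static void phy_process_error(struct phy_device *phydev,
			      const char *func, int err)
{
	if (!phydev->lock.locked)
		fprintf(stderr, "PHY-device data unsafe context\n");
	else if (func)
		fprintf(stderr, "%s: returned: %d\n", func, err);
	else
		fprintf(stderr, "unspecified PHY error\n");
	warn_count++;

	phydev->state = PHY_ERROR;
}

/* Hypothetical failing operation standing in for a PHY driver callback. */
static int fake_read_status(struct phy_device *phydev)
{
	(void)phydev;
	return -5;	/* same value as -EIO */
}

/* The body of phy_error_precise() folded into its single caller. */
static void phy_state_machine(struct phy_device *phydev)
{
	int err = fake_read_status(phydev);

	if (err < 0) {
		model_lock(&phydev->lock);
		phy_process_error(phydev, "fake_read_status", err);
		model_unlock(&phydev->lock);
	}
}
```

With the locking done at the call site, phy_process_error() is always
entered with the lock held on this path, so only the "returned" branch
fires and exactly one warning is produced per failure.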

Note that a positive side effect of this implementation is that
phy_error() could potentially be converted to accept a pointer to the
function that caused the error (phy_read(), phy_write(), etc.).
Alternatively, if that conversion would look too bulky,
phy_error_precise() could simply be EXPORT_SYMBOL()-ed, with the
PHY-device mutex locking moved to phy_state_machine().
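
To make the first option concrete, here is a rough userspace sketch of
what a widened phy_error() might look like from a caller's side. It is
an illustration under assumptions, not the kernel API: the three-argument
signature is hypothetical, the culprit is passed as a name string rather
than a function address printed with %pS, the lock is a plain flag, and
fake_phy_read() and fake_driver_poll() are invented helpers.

```c
/* Userspace model only; not kernel code. */
#include <stdbool.h>
#include <stdio.h>

enum phy_state { PHY_RUNNING, PHY_ERROR };

struct phy_device {
	bool lock_held;			/* stand-in for mutex_is_locked() */
	enum phy_state state;
	const char *last_culprit;	/* recorded in place of a %pS print */
	int last_err;
};

/* Hypothetical widened phy_error(): the caller names the failing accessor. */
static void phy_error(struct phy_device *phydev, const char *func, int err)
{
	if (!phydev->lock_held)
		fprintf(stderr, "PHY-device data unsafe context\n");
	else
		fprintf(stderr, "%s: returned: %d\n",
			func ? func : "(unknown)", err);

	phydev->last_culprit = func;
	phydev->last_err = err;
	phydev->state = PHY_ERROR;
}

/* Hypothetical register read that always fails. */
static int fake_phy_read(struct phy_device *phydev, int regnum)
{
	(void)phydev;
	(void)regnum;
	return -5;	/* same value as -EIO */
}

/* Hypothetical driver-side caller that already holds the lock. */
static int fake_driver_poll(struct phy_device *phydev)
{
	int ret = fake_phy_read(phydev, 1);

	if (ret < 0) {
		phy_error(phydev, "fake_phy_read", ret);
		return ret;
	}
	return 0;
}
```

The cost of this variant is that every existing phy_error() caller would
have to be touched to pass the extra arguments, which is the bulkiness
the alternative avoids.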

> 
> Sorry for changing my mind.

No worries.

-Serge(y)

> 
>     Andrew
> 
> ---
> pw-bot: cr