lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <651ab382.df0a0220.e74df.fc51@mx.google.com>
Date: Mon, 2 Oct 2023 14:11:43 +0200
From: Christian Marangi <ansuelsmth@...il.com>
To: Marek Behún <kabel@...nel.org>
Cc: "David S. Miller" <davem@...emloft.net>,
	Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net 2/2] net: dsa: qca8k: fix potential MDIO bus conflict
 when accessing internal PHYs via management frames

On Mon, Oct 02, 2023 at 12:46:12PM +0200, Marek Behún wrote:
> Besides the QCA8337 switch the Turris 1.x device has on it's MDIO bus
> also Micron ethernet PHY (dedicated to the WAN port).
> 
> We've been experiencing a strange behavior of the WAN ethernet
> interface, wherein the WAN PHY started timing out the MDIO accesses, for
> example when the interface was brought down and then back up.
> 
> Bisecting led to commit 2cd548566384 ("net: dsa: qca8k: add support for
> phy read/write with mgmt Ethernet"), which added support to access the
> QCA8337 switch's internal PHYs via management ethernet frames.
> 
> Connecting the MDIO bus pins onto an oscilloscope, I was able to see
> that the MDIO bus was active whenever a request to read/write an
> internal PHY register was done via an management ethernet frame.
> 
> My theory is that when the switch core always communicates with the
> internal PHYs via the MDIO bus, even when externally we request the
> access via ethernet. This MDIO bus is the same one via which the switch
> and internal PHYs are accessible to the board, and the board may have
> other devices connected on this bus. An ASCII illustration may give more
> insight:
> 
>            +---------+
>       +----|         |
>       |    | WAN PHY |
>       | +--|         |
>       | |  +---------+
>       | |
>       | |  +----------------------------------+
>       | |  | QCA8337                          |
> MDC   | |  |                        +-------+ |
> ------o-+--|--------o------------o--|       | |
> MDIO    |  |        |            |  | PHY 1 |-|--to RJ45
> --------o--|---o----+---------o--+--|       | |
>            |   |    |         |  |  +-------+ |
> 	   | +-------------+  |  o--|       | |
> 	   | | MDIO MDC    |  |  |  | PHY 2 |-|--to RJ45
> eth1	   | |             |  o--+--|       | |
> -----------|-|port0        |  |  |  +-------+ |
>            | |             |  |  o--|       | |
> 	   | | switch core |  |  |  | PHY 3 |-|--to RJ45
>            | +-------------+  o--+--|       | |
> 	   |                  |  |  +-------+ |
> 	   |                  |  o--|  ...  | |
> 	   +----------------------------------+
> 
> When we send a request to read an internal PHY register via an ethernet
> management frame via eth1, the switch core receives the ethernet frame
> on port 0 and then communicates with the internal PHY via MDIO. At this
> time, other potential devices, such as the WAN PHY on Turris 1.x, cannot
> use the MDIO bus, since it may cause a bus conflict.
> 
> Fix this issue by locking the MDIO bus even when we are accessing the
> PHY registers via ethernet management frames.
> 
> Fixes: 2cd548566384 ("net: dsa: qca8k: add support for phy read/write with mgmt Ethernet")
> Signed-off-by: Marek Behún <kabel@...nel.org>

Just some comments (micro-optimization) and one question.

Wonder if the extra lock would result in a bit of overhead for simple
implementation where the switch is the only thing connected to the MDIO.

It's just an idea and probably not even something to consider (since
probably the overhead is so little that it's not worth it)

But we might consider to add some logic in the MDIO setup function to
check if the MDIO have other PHY connected and enable this lock (and
make this optional with an if and a bool like require_mdio_locking)

If we don't account for this, yes the lock should have been there from
the start and this is correct. (we can make it optional only in the case
where only the switch is connected as it would be the only user and
everything is already locked by the eth_mgmt lock)

> ---
>  drivers/net/dsa/qca/qca8k-8xxx.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c
> index d2df30640269..4ce68e655a63 100644
> --- a/drivers/net/dsa/qca/qca8k-8xxx.c
> +++ b/drivers/net/dsa/qca/qca8k-8xxx.c
> @@ -666,6 +666,15 @@ qca8k_phy_eth_command(struct qca8k_priv *priv, bool read, int phy,
>  		goto err_read_skb;
>  	}
>  
> +	/* It seems that accessing the switch's internal PHYs via management
> +	 * packets still uses the MDIO bus within the switch internally, and
> +	 * these accesses can conflict with external MDIO accesses to other
> +	 * devices on the MDIO bus.
> +	 * We therefore need to lock the MDIO bus onto which the switch is
> +	 * connected.
> +	 */
> +	mutex_lock(&priv->bus->mdio_lock);
> +

Please move this down before the first dev_queue_xmit. (we can save a
few cycle where locking is not needed)

Also should we use mutex_lock_nested?

>  	/* Actually start the request:
>  	 * 1. Send mdio master packet
>  	 * 2. Busy Wait for mdio master command
> @@ -678,6 +687,7 @@ qca8k_phy_eth_command(struct qca8k_priv *priv, bool read, int phy,
>  	mgmt_master = priv->mgmt_master;
>  	if (!mgmt_master) {
>  		mutex_unlock(&mgmt_eth_data->mutex);
> +		mutex_unlock(&priv->bus->mdio_lock);
>  		ret = -EINVAL;
>  		goto err_mgmt_master;
>  	}
> @@ -765,6 +775,7 @@ qca8k_phy_eth_command(struct qca8k_priv *priv, bool read, int phy,
>  				    QCA8K_ETHERNET_TIMEOUT);
>  
>  	mutex_unlock(&mgmt_eth_data->mutex);
> +	mutex_unlock(&priv->bus->mdio_lock);
>  
>  	return ret;
>  
> -- 
> 2.41.0
> 

-- 
	Ansuel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ