lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <96d32ce8-394b-4454-8910-a66be2813588@cherry.de>
Date: Fri, 13 Jun 2025 16:27:54 +0200
From: Quentin Schulz <quentin.schulz@...rry.de>
To: Andrew Lunn <andrew@...n.ch>
Cc: Jakob Unterwurzacher <jakobunt@...il.com>, foss+kernel@...il.net,
 conor+dt@...nel.org, devicetree@...r.kernel.org, heiko@...ech.de,
 jakob.unterwurzacher@...rry.de, krzk+dt@...nel.org,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 linux-rockchip@...ts.infradead.org, robh@...nel.org,
 Kever Yang <kever.yang@...k-chips.com>
Subject: Re: [PATCH v2] arm64: dts: rockchip: support Ethernet Switch adapter
 for RK3588 Jaguar

Hi Andrew,

On 5/28/25 3:09 PM, Andrew Lunn wrote:
> On Wed, May 28, 2025 at 09:56:51AM +0200, Quentin Schulz wrote:
>> Hi Andrew,
>>
>> On 5/27/25 6:18 PM, Andrew Lunn wrote:
>>> On Tue, May 27, 2025 at 03:11:42PM +0200, Jakob Unterwurzacher wrote:
>>>>> @Jakob, is this something you could check? devmem2 0xfd58c31c w 0x3c0000
>>>>> should do the trick to disable the circuitry according to the TRM?
>>>>
>>>> I measured TXCLK vs TXD3 on an oscilloscope on gmac1:
>>>>
>>>> 	Setting	Decimal	Actual TXCLK delay (ps)
>>>> 	00	0	47
>>>> 	0a	10	283
>>>> 	10	16	440
>>>> 	20	32	893
>>>> 	30	48	1385
>>>> 	40	64	1913
>>>> 	50	80	2514
>>>> 	60	96	3077
>>>> 	70	112	3565
>>>> 	7f	127	4009
>>>>
>>>> 	off	x	-315
>>>>
>>>> Setting = tx_delay (hex)
>>>> Decimal = tx_delay (dec)
>>>> Actual TXCLK delay (ps) = Measurement from oscilloscope
>>>>
>>>> Plotting this we can deduce that one tx_delay unit is about 31ps.
>>>
>>> Nice to see somebody actually do the measurements. Based on this, it
>>> would be good to implement:
>>>
>>>           tx-internal-delay-ps:
>>>             description:
>>>               RGMII Transmit Clock Delay defined in pico seconds. This is used for
>>>               controllers that have configurable TX internal delays. If this
>>>               property is present then the MAC applies the TX delay.
>>>
>>> For the moment, please limit it to just the device you measured it on.
>>>
>>
>> What exactly do you mean with "limit it to just the device you measured it
>> on"?
> 
> Nobody seems to know if rx_delay & tx_delay operate the same across
> the whole range of SoCs. I don't particularly care if these properties
> are difference between SoC, they are vendor properties, with
> undocumented magic values. However 'tx-internal-delay-ps' is
> standardised, and has a very clear meaning. I don't want it used
> unless somebody has performed a measurement and we know that 2000
> produces a 2ns delay.
> 
>> I'll need to implement reading the delay from the stmmac driver to use this
>> property, do I need to restrict reading this property to the SoC we tested
>> (RK3588)?
> 
> Yes, please only allow it to be used on RK3588, and any other SoC you
> can test and verify its behaviour.

Coming back to this topic, I'm unfortunately the bearer of some bad news.

I implemented the suggested logic (see at the end of this mail) and then 
went to validate it with Jakob's help. Unfortunately, it seems that the 
delay value really isn't stable or reliable.

We tested the same adapter with two different main boards (the same 
product, just two different units). With a value of 0x40 for tx_delay 
(which should be ~2000ps if we have a 31ps per tx_delay unit as 
empirically decided), we have one board with 1778ps and one with 1391ps. 
Following a hunch, we started to stress (or cool) the device (with 
stress-ng/a fan) and it did slightly change the result too. Changing the 
CPU operating points (and by extension at least CPU clocks) didn't 
impact the result though.

While this could be observed with tx_delay property too, this property 
doesn't claim to provide a value in picoseconds that 
tx-internal-delay-ps would (but at the same time this didn't stop it to 
be implemented for the DSA switch we have which claims "more than 1.5ns" 
and nothing more, so maybe that would be acceptable?).

I feel uncomfortable contributing this considering the wildly different 
results across our very small test sample pool of two units and slightly 
different operating temperature.

Cheers,
Quentin

Here's the patch I made:

"""
 From 4ad2976b08b0ca93504942a80aba4bfe76fdbc51 Mon Sep 17 00:00:00 2001
From: Quentin Schulz <quentin.schulz@...rry.de>
Date: Wed, 4 Jun 2025 14:12:01 +0200
Subject: [PATCH] net: stmmac: dwmac-rk: add support for tx-internal-delay-ps
  for RK3588

The RGMII v2.0 spec[1] specifies the clock output needs to be delayed by
at least 1.2ns (typically 2.0ns) compared to the data lanes.

As documented[2], there are multiple ways to implement this delay. The
PCB routing adds this delay by making the clock lane longer than the
data lanes. The MAC adds the delay. The PHY adds the delay. Mixed delays
are heavily discouraged. The first one is represented by rgmii phy-mode,
the second and third one as rgmii-id. Since RGMII has two clocks (RX and
TX), it is possible their delay implementation differ. In which case,
rgmii-rxid represents RX delay added by the MAC/PHY while TX delay is
added by way of PCB routing, and rgmii-txid represents the opposite.

The Rockchip DWMAC driver actually implements the RGMII internal delay
the complete opposite way.
The driver allows to specify the RX and TX delay (using an undocumented
and yet unknown unit) through the Device Tree (via rx_delay and
tx_delay). However, rx_delay is only applied when in rgmii or rgmii-txid
mode and tx_delay only when in rgmii or rgmii-rxid.
This means the driver is currently entirely incompatible with the Device
Tree binding as in order to add a delay on the MAC we need to set
phy-mode on the MAC side to rgmii instead of rgmii-id, rgmii-rxid
instead of rgmii-txid or vice-versa, along with rx_delay/tx_delay
properties.

Because we cannot break backward compatibility with existing Device
Trees, an attempt to fix this requires a new property. Fortunately, the
Ethernet controller binding already defines such property:
tx-internal-delay-ps and rx-internal-delay-ps. Those represent delays in
picoseconds.

Unfortunately, there's no documentation as far as I could tell about the
relation between an rx_delay/tx_delay to write in the DWMAC registers on
Rockchip and the actual delay in picoseconds.
However, based on empirical data[3], it seems a unit in the
gmac{0,1}_clk_tx_dl_cfg bitfield SYS_GRF_SOC_CON{8,9} equals roughly 31
picoseconds of delay on RK3588. Considering that the RGMII spec only
requires a delay of at least 1200ps, a granularity of 31ps seems
acceptable even if it later turns out the scale isn't perfectly linear.
Users will typically set tx-internal-delay-ps above 1200 (in
tx_delay "units": above 0x27).

So this adds support for setting up the delay via tx-internal-delay-ps
on RK3588 only as there's no empirical data for other SoCs just yet and
only for TX delay as we cannot measure RX delay within the SoC.

Setting both tx_delay and tx-internal-delay-ps is nonsensical as they
represent the same delay but in entirely different HW setups that are
defined in Device Tree, so this is forbidden by the driver.

Only print error messages if tx_delay is absent AND tx-internal-delay-ps
is absent as well. Still allow tx_delay to be absent for backward
compatibility.

If one really has a PCB which only adds an RX delay, one unfortunately
still needs to set rx_delay to 0 in Device Tree in addition to setting
the phy-mode to rgmii-txid as a missing rx_delay property will make the
RX delay in the MAC default to 0x10.

If tx-internal-delay-ps is missing, tx_delay_from_ps is guaranteed to be
0 thanks to kzalloc allocation of bsp_priv rk_priv_data struct so
current behavior is left unchanged for existing Device Trees and SoCs
which do not currently support tx-internal-delay-ps (all but RK3588).

If one has a board which adds the TX delay on the PHY side, setting
phy-mode to rgmii-txid/rgmii-id in the Device Tree is technically enough
as tx-internal-delay-ps = <0> or a missing tx-internal-delay-ps will
result in the same behavior.

[1] https://etherealwake.com/2025/03/ethernet-rgmii/rgmii_2_0.pdf
[2] 
https://elixir.bootlin.com/linux/v6.15/source/Documentation/devicetree/bindings/net/ethernet-controller.yaml#L287
[3] 
https://lore.kernel.org/linux-rockchip/20250527131142.1100673-1-jakob.unterwurzacher@cherry.de/
Suggested-by: Andrew Lunn <andrew@...n.ch>
Signed-off-by: Quentin Schulz <quentin.schulz@...rry.de>
---
  .../net/ethernet/stmicro/stmmac/dwmac-rk.c    | 29 +++++++++++++++----
  1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
index 700858ff6f7c3..63730a334f0a9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
@@ -77,6 +77,7 @@ struct rk_priv_data {
  	struct reset_control *phy_reset;

  	int tx_delay;
+	int tx_delay_from_ps;
  	int rx_delay;

  	struct regmap *grf;
@@ -1670,6 +1671,7 @@ static struct rk_priv_data *rk_gmac_setup(struct 
platform_device *pdev,
  {
  	struct rk_priv_data *bsp_priv;
  	struct device *dev = &pdev->dev;
+	bool of_internal_delay_ps;
  	struct resource *res;
  	int ret;
  	const char *strings = NULL;
@@ -1718,13 +1720,30 @@ static struct rk_priv_data *rk_gmac_setup(struct 
platform_device *pdev,
  			bsp_priv->clock_input = false;
  	}

+	of_internal_delay_ps = false;
+
+	if (device_is_compatible(dev, "rockchip,rk3588-gmac")) {
+		ret = of_property_read_u32(dev->of_node, "tx-internal-delay-ps", &value);
+		if (!ret) {
+			/* Empirical data only available for RK3588 on TX path */
+			bsp_priv->tx_delay_from_ps = value / 31;
+			of_internal_delay_ps = true;
+		}
+	}
+
  	ret = of_property_read_u32(dev->of_node, "tx_delay", &value);
  	if (ret) {
  		bsp_priv->tx_delay = 0x30;
-		dev_err(dev, "Can not read property: tx_delay.");
-		dev_err(dev, "set tx_delay to 0x%x\n",
-			bsp_priv->tx_delay);
+		if (!of_internal_delay_ps) {
+			dev_err(dev, "Can not read property: tx_delay.");
+			dev_err(dev, "set tx_delay to 0x%x\n",
+				bsp_priv->tx_delay);
+		}
  	} else {
+		if (of_internal_delay_ps) {
+			dev_err_probe(dev, -EINVAL, "Only one of tx_delay and 
tx-internal-delay-ps must exist\n");
+			return ERR_PTR(-EINVAL);
+		}
  		dev_info(dev, "TX delay(0x%x).\n", value);
  		bsp_priv->tx_delay = value;
  	}
@@ -1821,7 +1840,7 @@ static int rk_gmac_powerup(struct rk_priv_data 
*bsp_priv)
  		break;
  	case PHY_INTERFACE_MODE_RGMII_ID:
  		dev_info(dev, "init for RGMII_ID\n");
-		bsp_priv->ops->set_to_rgmii(bsp_priv, 0, 0);
+		bsp_priv->ops->set_to_rgmii(bsp_priv, bsp_priv->tx_delay_from_ps, 0);
  		break;
  	case PHY_INTERFACE_MODE_RGMII_RXID:
  		dev_info(dev, "init for RGMII_RXID\n");
@@ -1829,7 +1848,7 @@ static int rk_gmac_powerup(struct rk_priv_data 
*bsp_priv)
  		break;
  	case PHY_INTERFACE_MODE_RGMII_TXID:
  		dev_info(dev, "init for RGMII_TXID\n");
-		bsp_priv->ops->set_to_rgmii(bsp_priv, 0, bsp_priv->rx_delay);
+		bsp_priv->ops->set_to_rgmii(bsp_priv, bsp_priv->tx_delay_from_ps, 
bsp_priv->rx_delay);
  		break;
  	case PHY_INTERFACE_MODE_RMII:
  		dev_info(dev, "init for RMII\n");
"""

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ