linux-kernel - Re: [PATCH 2/2] arm64: dts: rockchip: support Ethernet Switch adapter for RK3588 Jaguar

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <eac73db9-5747-44a8-aa5e-f1e3edd25e8a@cherry.de>
Date: Fri, 23 May 2025 18:47:48 +0200
From: Quentin Schulz <quentin.schulz@...rry.de>
To: Andrew Lunn <andrew@...n.ch>
Cc: Quentin Schulz <foss+kernel@...il.net>, Rob Herring <robh@...nel.org>,
 Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley
 <conor+dt@...nel.org>, Heiko Stuebner <heiko@...ech.de>,
 Jakob Unterwurzacher <jakob.unterwurzacher@...rry.de>,
 devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 linux-rockchip@...ts.infradead.org, linux-kernel@...r.kernel.org,
 Kever Yang <kever.yang@...k-chips.com>
Subject: Re: [PATCH 2/2] arm64: dts: rockchip: support Ethernet Switch adapter
 for RK3588 Jaguar

Hi Andrew,

On 5/22/25 6:59 PM, Andrew Lunn wrote:
>> Would the suggested change in the previous answer be acceptable to you?
>>
>> @Kever, is there any way to know what the register values for
>> rx_delay/tx_delay actually mean in terms of picoseconds delay added by the
>> MAC maybe?
> 
> The problem is, what exactly do these values mean? Is it documented
> somewhere?
> 

Those aren't documented in Rockchip's TRM (Technical Reference Manual) 
as far as I could tell, hence why I asked for Kever's input on that. We 
could empirically find direct correlation to a specific value but I 
cannot do this for every version of the IP we have in all Rockchip SoCs 
as I only have access to a handful :)

>>>>> Since this has a switch on the other end, its a bit more complicated
>>>>> with RGMII delays. Normally, the MAC does nothing and passed rgmii-id
>>>>> to the PHY, and the PHY then does the delays. However, here you don't
>>>>> have a PHY. So you have the MAC add the delays. This looks O.K. I
>>>>
>>>> The switch actually supports adding delays on the port used for DSA conduit.
>>>
>>> That actually looks to be the simplest and correct solution. Set the
>>> MAC to 'rgmii-id', rx_delay and tx_delay to 0, even if they are
>>> ignored. And in the switch, also 'rgmii-id' and let it insert the 2ns
>>> delay. You can use rx-internal-delay-ps and tx-internal-delay-ps if
>>> you want, but it seems to default to sensible values.
>>>
>>
>> I don't have control on how much is inserted by the PHY as opposed to the
>> MAC, so I'm wary of using a much less precise (on paper) delay.
> 
> Experience i've had with this is that you don't need to be too
> accurate. Devices generally work with 2ns.
> 

I see, I think I took this 2ns too critically. I thought we really were 
supposed to be close to it but thanks to the tip about the RGMII spec[1] 
we can see that the delay at transmitter needs to be between 1.2 and 2ns 
while at the receiver it needs to be between 1 and 2ns. So "at least 
1.5ns" granularity in the PHY is good enough for us :) (and we verified 
it works of course :) )

[1] https://etherealwake.com/2025/03/ethernet-rgmii/rgmii_2_0.pdf

>> I have no
>> clue if doing this in the PHY is going to put us in the center of the eye or
>> not. Thanks to Rockchip's kernel tool, we are expecting to be in the center
>> of the eye right now.
> 
> What exactly does the tool do? Can you run it when the 'PHY' is adding
> the delay and see how good the eye alignment is?
> 
> arch$ grep -hr "tx_delay =" | sort | uniq -c
>        1 	tx_delay = <0x0>;
>        4 	tx_delay = <0x10>;
>        1 	tx_delay = <0x1a>;
>        1 	tx_delay = <0x20>;
>        3 	tx_delay = <0x21>;
>        2 	tx_delay = <0x22>;
>        5 	tx_delay = <0x24>;
>        2 	tx_delay = <0x26>;
>       15 	tx_delay = <0x28>;
>        2 	tx_delay = <0x2a>;
>       17 	tx_delay = <0x30>;
>        1 	tx_delay = <0x3a>;
>        3 	tx_delay = <0x3c>;
>        3 	tx_delay = <0x42>;
>        5 	tx_delay = <0x43>;
>        2 	tx_delay = <0x44>;
>        1 	tx_delay = <0x45>;
>        1 	tx_delay = <0x46>;
>        6 	tx_delay = <0x4f>;
> 
> So 0x30 is the most popular, and i expect it maps to 2ns. The 0x28
> value is interesting, given that 0x2a is not used much. That makes me
> think there might be a common PHY used with these boards which has a
> small built in delay when it should not.
>   

Note that Rockchip SoCs typically have integrated PHYs as well (which we 
don't use at CHERRY but the option is there). The answer otherwise is 
usually of the kind of "we followed the reference design, it works with 
the DT from the evaluation board, this DT was accepted, we do the same". 
For contributors, I don't think it's feasible to figure out whether the 
PCB lane length matching is done properly or how much is actually 
required except by modifying the delay by hand and find both extremes 
where it stops working and empirically decide the eye is in the middle 
of those (for your board, which may differ from another board of the 
same kind).

>>>> I'm a bit confused by the following sentence:
>>>>
>>>> """
>>>> Normally, the MAC does nothing and passed rgmii-id
>>>> """
>>>>
>>>> is this something that the MAC driver is supposed to do or is the subsystem
>>>> handling that somehow? How do I know how/when to rewrite the phy-mode passed
>>>> to the PHY?
>>>
>>> A small number of MACs have hard coded delays. You cannot turn the
>>> delay off. So the MAC has no choice but to do the delay. 'rgmii' is
>>> simply not possible, so -EINVAL. For 'rgmii-id', if you pass that to
>>> the PHY, it will also add a delay, and 4ns in total does not work. So
>>> when the MAC is adding delays, it needs to mask out the delays it is
>>> adding before calling phy_attach() passing an rgmii mode.
>>>
>>
>> If I understand correctly, if phy-mode is rgmii-id in DT and the MAC is
>> adding the delay, I need to force PHY_INTERFACE_MODE_RGMII phy-mode when
>> attaching the PHY in the MAC device.
> 
> Correct. DT phy-mode describes the PCB. Does the PCB add the 2ns
> delay.
> 
> Once you get to the MAC/PHY pair, what the MAC passes to the PHY is
> about what remaining delays need adding. It could be nothing, because
> the PCB adds the delay. It could be all of it, because neither the PCB
> nor the MAC add the delays. It can also be nothing because the MAC has
> already added the delays.
> 

Clear thanks. I'm not sure whether Linux already has a way for the MAC 
to provide which delay the PHY is supposed to do? I (now) know we can 
override the phy_mode via one of phy_attach*/phy_connect* function which 
take a phy_interface_t as argument. I see a phy_get_internal_delay but 
not the setter part of it. Anyway, don't want to drag this too much as I 
don't need this anymore since the delay will be added by the PHY.

> For linux, we have the policy that the PHY adds the delay, in an
> attempt to try to make all systems the same. And most boards follow
> this. And then we have a few systems that totally mess up delays, have
> odd configuration knobs nobody knows what the do exactly, and put the
> delays in the MAC.
> 
>> Does this also mean you cannot have mixed addition of delay? E.g. having 1ns
>> from MAC and 1ns from PHY? It has to be only on one of the IC?
> 
> It is not recommended, because of the policy that the PHY does the
> delay... You can, if you make use of the rx-internal-delay-ps
> properties, assuming both the MAC and PHY support them.
> 

Thanks to your and Jakob (in private)'s explanations, I now understand 
that this is very likely not required as I would assume HW engineers for 
the IC would essentially allow to add some delay between 1.2ns and 2ns 
to satisfy the RGMII v2.0 spec but there's no need to be precise or have 
granularity. So this scenario is theoretical and could very likely be 
due to accumulated HW mistakes (either in ICs or routing)?

>> In the comment at the bottom of the dt binding there's this following
>> sentence:
>>
>> # When the PCB does not implement the delays, the MAC or PHY must.  As
>> # such, this is software configuration, and so not described in Device
>> # Tree.
>>
>> Why do we have two possible locations for rx-internal-delay-ps: PHY and MAC?
> 
> Sometimes both can add variable delays. Sometimes it is fixed 2ns.
> 
> I would actually prefer that these properties were not used, because

I assume we would therefore need some kind of negotiation between the 
PHY and the MAC at the subsystem level to figure out if the PHY can do 
the requested delay or if it should be handled by the MAC instead?

> they indicate somebody messed up. If you read the RGMII standard, it
> says a 2ns delay is required between the clock and the data. It is as
> simple as that. If you need to fine tune it, it means one of:
> 
> The PCB is badly designed, care was not taken to ensure the PCB tracks
> are the same length.
> 
> The MAC is badly designed, it does not add 0ns/2ns, but some other
> delay.
> 
> The PHY is badly designed, it does not add 0ns/2ns, but some other
> delay.
> 

Just need to be in the 1-2ns but otherwise I fully agree, thanks for the 
hint of the RGMII spec that helped!

Thanks for taking the time to write this all down and your patience, 
that was all very helpful!

Cheers,
Quentin