lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 18 Jun 2020 11:06:43 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Helmut Grohne <helmut.grohne@...enta.de>
Cc:     Nicolas Ferre <nicolas.ferre@...rochip.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Palmer Dabbelt <palmer@...belt.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH] net: macb: reject unsupported rgmii delays



On 6/18/2020 1:45 AM, Russell King - ARM Linux admin wrote:
> On Thu, Jun 18, 2020 at 10:14:33AM +0200, Helmut Grohne wrote:
>> On Wed, Jun 17, 2020 at 02:08:09PM +0200, Russell King - ARM Linux admin wrote:
>>> With a fixed link, we could be in either a MAC-to-PHY or MAC-to-MAC
>>> setup; we just don't know.  However, we don't have is access to the
>>> PHY (if it exists) in the fixed link case to configure it for the
>>> delay.
>>
>> Let me twist that a little: We may have access to the PHY, but we don't
>> always have access. When we do have access, we have a separate device
>> tree node with another fixed-link and another phy-mode. For fixed-links,
>> we specify the phy-mode for each end.
> 
> If you have access to the PHY, you shouldn't be using fixed-link. In
> any case, there is no support for a fixed-link specification at the
> PHY end in the kernel.  The PHY binding doc does not allow for this
> either.
> 
>>> In the MAC-to-MAC RGMII setup, where neither MAC can insert the
>>> necessary delay, the only way to have a RGMII conformant link is to
>>> have the PCB traces induce the necessary delay. That errs towards
>>> PHY_INTERFACE_MODE_RGMII for this case.
>>
>> Yes.
>>
>>> However, considering the MAC-to-PHY RGMII fixed link case, where the
>>> PHY may not be accessible, and may be configured with the necessary
>>> delay, should that case also use PHY_INTERFACE_MODE_RGMII - clearly
>>> that would be as wrong as using PHY_INTERFACE_MODE_RGMII_ID would
>>> be for the MAC-to-MAC RGMII with PCB-delays case.
>>
>> If you take into account that the PHY has a separate node with phy-mode
>> being rgmii-id, it makes a lot more sense to use rgmii for the phy-mode
>> of the MAC. So I don't think it is that clear that doing so is wrong.
> 
> The PHY binding document does not allow for this, neither does the
> kernel.
> 
>> In an earlier discussion Florian Fainelli said:
>> https://patchwork.ozlabs.org/project/netdev/patch/20190410005700.31582-19-olteanv@gmail.com/#2149183
>> | fixed-link really should denote a MAC to MAC connection so if you have
>> | "rgmii-id" on one side, you would expect "rgmii" on the other side
>> | (unless PCB traces account for delays, grrr).
>>
>> For these reasons, I still think that rgmii would be a useful
>> description for the fixed-link to PHY connection where the PHY adds the
>> delay.
> 
> I think Florian is wrong; consider what it means when you have a
> fixed-link connection between a MAC and an inaccessible PHY and
> the link as a whole is operating in what we would call "rgmii-id"
> mode if the PHY were accessible.
> 
> Taking Florian's stance, it basically means that DT no longer
> describes the hardware, but how we have chosen to interpret the
> properties in _one_ specific case in a completely different way
> to all the other cases.

How do you ensure that a MAC to MAC connection using RGMII actually
works if you do not provide a 'phy-mode' for both sides right now?

Yes this is more of a DT now describes a configuration choice rather
than the hardware but given that Ethernet MACs are usually capable of
supporting all 4 possible RGMII modes, how do you propose to solve the
negotiation of the appropriate RGMII mode here?

> 
>>> So, I think a MAC driver should not care about the specific RGMII
>>> mode being asked for in any case, and just accept them all.
>>
>> I would like to agree to this. However, the implication is that when you
>> get your delays wrong, your driver silently ignores you and you never
>> notice your mistake until you see no traffic passing and wonder why.
> 
> So explain to me this aspect of your reasoning:
> 
> - If the link is operating in non-fixed-link mode, the rgmii* PHY modes
>   describe the delay to be applied at the PHY end.
> - If the link is operating in fixed-link mode, the rgmii* PHY modes
>   describe the delay to be applied at the MAC end.
> 
> That doesn't make sense, and excludes properly describing a MAC-to-
> inaccessible-PHY setup.

The point with a fixed link is that it does not matter what is the other
end, it could be a MAC, it could be a PHY, it could go nowhere, it just
does not matter, the only thing that you can know is you configure your
side of the fixed link. If you have visibility into the other side as
well, you can go one step further an put something in the other
fixed-link node that makes sense and will result in a working RGMII link.

> 
> It also means that we're having to conditionalise how we deal with
> this PHY mode in every single driver, which means that every single
> driver is going to end up interpreting it differently, and it's going
> to become a buggy mess.

Newsflash: it is already a buggy mess.

> 
>> In this case, I was faced with a PHY that would do rgmii-txid and I
>> configured that on the MAC. Unfortunately, macb_main.c didn't tell me
>> that it did rgmii-id instead.
> 
> The documentation makes it clear that "rgmii-*" (note the hyphen) are
> to be applied by the PHY *only*, and not by the MAC.
> 
>>> I also think that some of this ought to be put in the documentation
>>> as guidance for new implementations.
>>
>> That seems to be the part where everyone agrees.
>>
>> Given the state of the discussion, I'm wondering whether this could be
>> fixed at a more fundamental level in the device tree bindings.
>>
>> A number of people (including you) pointed out that retroactively fixing
>> the meaning of phy modes does not work and causes pain instead. That
>> hints that the only way to fix this is adding new properties. How about
>> this?
>>
>> rgmii-delay-type:
>>   description:
>>     Responsibility for adding the rgmii delay
>>   enum:
>>     # The remote PHY or MAC to this MAC is responsible for adding the
>>     # delay.
>>     - remote
>>     # The delay is added by neither MAC nor MAC, but using PCB traces
>>     # instead.
>>     - pcb
>>     # The MAC must add the delay.
>>     - local
> 
> Why do we need that complexity?  If we decide that we can allow
> phy-mode = "rgmii" and introduce new properties to control the
> delay, then we just need:
> 
>   rgmii-tx-delay-ps = <nnn>;
>   rgmii-rx-delay-ps = <nnn>;
> 
> specified in the MAC node (to be applied only at the MAC end) or
> specified in the PHY node (to be applied only at the PHY end.)
> In the normal case, this would be the standard delay value, but
> in exceptional cases where supported, the delays can be arbitary.
> We know there are PHYs out there which allow other delays.
> 
> This means each end is responsible for parsing these properties in
> its own node and applying them - or raising an error if they can't
> be supported.

And as we all know, RGMII is the most plug and play standard out there
so this is "just going to work".

> 
> With your "rgmii-delay-type" idea, if this is only specified at the
> MAC end, then the PHY code somehow needs to know what that setting is,
> which adds a lot of complexity - the PHY code has to go digging for
> the MAC node, we may even have to introduce a back-reference from the
> PHY node to the MAC node so the PHY can find it.  There are MAC drivers
> out there where there is one struct device, but multiple net devices,
> so digging inside struct net_device to get at the parent struct device,
> and then trying to parse "rgmii-delay-type" from the of-node won't work.
>-- 
Florian

Powered by blists - more mailing lists