lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 Nov 2020 23:43:01 +0100
From:   Andrew Lunn <andrew@...n.ch>
To:     Arnd Bergmann <arnd@...nel.org>
Cc:     Jernej Škrabec <jernej.skrabec@...il.com>,
        Ard Biesheuvel <ardb@...nel.org>,
        Daniel Thompson <daniel.thompson@...aro.org>,
        Sumit Garg <sumit.garg@...aro.org>,
        Alex Bennée <alex.bennee@...aro.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Steve McIntyre <steve@...val.com>,
        "open list:BPF JIT for MIPS (32-BIT AND 64-BIT)" 
        <netdev@...r.kernel.org>, Willy Liu <willy.liu@...ltek.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Sasha Levin <sashal@...nel.org>,
        Florian Fainelli <f.fainelli@...il.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Masahisa Kojima <masahisa.kojima@...aro.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>
Subject: Re: Re: realtek PHY commit bbc4d71d63549 causes regression

Hi Arnd

> Something of that sort. I also see a similar patch in KSZ9031
> now, see 7dd8f0ba88fc ("arm: dts: imx6qdl-udoo: fix rgmii phy-mode
> for ksz9031 phy")
> 
> As this exact mismatch between rgmii and rgmii-id mode is apparently
> a more widespread problem, the best workaround I can think of
> is that we redefine the phy-mode="rgmii" property to actually mean
> "use rgmii mode and let the phy driver decide the delay configuration",

The problem is, the PHY driver has no idea what the delay
configuration should be. That is the whole point of the DT property.

The MAC and the PHY have to work together to ensure one of them
inserts the delay. In most cases, the MAC driver reads the property
and passes it unmodified to the PHY. The PHY then does what it is
told. In some cases, the MAC decides to add the delay, it changes the
rgmii-id to rgmii before passing it onto the PHY. The PHY does as it
is told, and it works. And a very small number of boards simply have
longer clock lines than signal lines, so the PCB adds the delay. It is
not clearly defined how that should be described in DT, but it works
so far because most MAC drivers don't add delays, pass the 'rgmii'
from DT to the PHY and it does as it is told and does not add delays.

There is one more case, which is not used very often. The PHY is
passed the NA values, which means, don't touch, something else has set
it up. So when the straps are doing the correct thing, you could pass
NA. However, some MAC drivers look at the phy mode, see it is one of
the 4 rgmii modes, and configure their end to rgmii, instead of gmii,
mii, sgmii, etc. How networking does ACPI is still very undefined, but
i think we need to push for ACPI to pass NA, and the firmware does all
the setup. That seems to be ACPI way.

> with a new string to mean specifically rgmii mode with no delay.

As you said, we have phy-mode="rgmii" 235 times. How many of those are
going to break when you change the definition of rgmii?  I have no
idea, but my gut feeling is more than the number of boards which are
currently broken because of the problem with this PHY.

And, as i said above, some MAC drivers look for one of the 4 RGMII
modes in order to configure their side. If you add a new string, you
need to review all the MAC drivers and make sure they check for all 5
strings, not 4. Some of that is easy, modify
phy_interface_mode_is_rgmii(), but not all MAC use it, and it is no
help in a switch statement.

And we are potentially going to get into the same problem
again. History has shown, we cannot get 4 properties right. Do you
think we will do any better getting 5 properties right? Especially
when phy-mode="rgmii" does not mean rgmii, but do whatever you think
might be correct?

Having suffered the pain from the Atheros PHY, this is something i
review much more closely, so hopefully we are getting better at
this. But PHY drivers live for a long time, ksz9031 was added 7 years
ago, well before we started looking closely at delays. I expect more
similar problems will keep being found over the next decade.

To some extent, we actually need DT writers to properly test their
DT. If both rgmii and rgmii-id works, there is a 50% chance whatever
they pick is wrong. And it would be nice if they told the networking
people so we can fix the PHY.

       Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ