lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMj1kXGnfsX1pH8m1eO-B1nAqL=vMeuw6fpYdeA1RqMpSrg66Q@mail.gmail.com>
Date:   Fri, 13 Nov 2020 23:49:39 +0100
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Arnd Bergmann <arnd@...nel.org>,
        Jernej Škrabec <jernej.skrabec@...il.com>,
        Daniel Thompson <daniel.thompson@...aro.org>,
        Sumit Garg <sumit.garg@...aro.org>,
        Alex Bennée <alex.bennee@...aro.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Steve McIntyre <steve@...val.com>,
        "open list:BPF JIT for MIPS (32-BIT AND 64-BIT)" 
        <netdev@...r.kernel.org>, Willy Liu <willy.liu@...ltek.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Sasha Levin <sashal@...nel.org>,
        Florian Fainelli <f.fainelli@...il.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Masahisa Kojima <masahisa.kojima@...aro.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>
Subject: Re: Re: realtek PHY commit bbc4d71d63549 causes regression

On Fri, 13 Nov 2020 at 23:43, Andrew Lunn <andrew@...n.ch> wrote:
>
> Hi Arnd
>
> > Something of that sort. I also see a similar patch in KSZ9031
> > now, see 7dd8f0ba88fc ("arm: dts: imx6qdl-udoo: fix rgmii phy-mode
> > for ksz9031 phy")
> >
> > As this exact mismatch between rgmii and rgmii-id mode is apparently
> > a more widespread problem, the best workaround I can think of
> > is that we redefine the phy-mode="rgmii" property to actually mean
> > "use rgmii mode and let the phy driver decide the delay configuration",
>
> The problem is, the PHY driver has no idea what the delay
> configuration should be. That is the whole point of the DT property.
>
> The MAC and the PHY have to work together to ensure one of them
> inserts the delay. In most cases, the MAC driver reads the property
> and passes it unmodified to the PHY. The PHY then does what it is
> told. In some cases, the MAC decides to add the delay, it changes the
> rgmii-id to rgmii before passing it onto the PHY. The PHY does as it
> is told, and it works. And a very small number of boards simply have
> longer clock lines than signal lines, so the PCB adds the delay. It is
> not clearly defined how that should be described in DT, but it works
> so far because most MAC drivers don't add delays, pass the 'rgmii'
> from DT to the PHY and it does as it is told and does not add delays.
>
> There is one more case, which is not used very often. The PHY is
> passed the NA values, which means, don't touch, something else has set
> it up. So when the straps are doing the correct thing, you could pass
> NA. However, some MAC drivers look at the phy mode, see it is one of
> the 4 rgmii modes, and configure their end to rgmii, instead of gmii,
> mii, sgmii, etc. How networking does ACPI is still very undefined, but
> i think we need to push for ACPI to pass NA, and the firmware does all
> the setup. That seems to be ACPI way.
>
> > with a new string to mean specifically rgmii mode with no delay.
>
> As you said, we have phy-mode="rgmii" 235 times. How many of those are
> going to break when you change the definition of rgmii?  I have no
> idea, but my gut feeling is more than the number of boards which are
> currently broken because of the problem with this PHY.
>
> And, as i said above, some MAC drivers look for one of the 4 RGMII
> modes in order to configure their side. If you add a new string, you
> need to review all the MAC drivers and make sure they check for all 5
> strings, not 4. Some of that is easy, modify
> phy_interface_mode_is_rgmii(), but not all MAC use it, and it is no
> help in a switch statement.
>
> And we are potentially going to get into the same problem
> again. History has shown, we cannot get 4 properties right. Do you
> think we will do any better getting 5 properties right? Especially
> when phy-mode="rgmii" does not mean rgmii, but do whatever you think
> might be correct?
>
> Having suffered the pain from the Atheros PHY, this is something i
> review much more closely, so hopefully we are getting better at
> this. But PHY drivers live for a long time, ksz9031 was added 7 years
> ago, well before we started looking closely at delays. I expect more
> similar problems will keep being found over the next decade.
>
> To some extent, we actually need DT writers to properly test their
> DT. If both rgmii and rgmii-id works, there is a 50% chance whatever
> they pick is wrong. And it would be nice if they told the networking
> people so we can fix the PHY.
>

One question that still has not been answered is how many actual
platforms were fixed by backporting Realtek's follow up fix to
-stable. My suspicion is none. That by itself should be enough
justification to revert the backport of that change.

I do agree that we should fix this properly going forward, and if we
do manage to fix this in a backwards compatible way, we should
backport that fix. But letting the current situation exist because
nobody can be bothered to fix it properly is not the right solution
IMHO.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ