Message-ID: <87msg66uh4.fsf@waldekranz.com>
Date: Sun, 05 Jan 2025 00:16:07 +0100
From: Tobias Waldekranz <tobias@...dekranz.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: davem@...emloft.net, kuba@...nel.org, andrew@...n.ch,
 f.fainelli@...il.com, olteanv@...il.com, netdev@...r.kernel.org,
 chris.packham@...iedtelesis.co.nz, pabeni@...hat.com, marek.behun@....cz
Subject: Re: [PATCH v2 net 3/4] net: dsa: mv88e6xxx: Never force link on
 in-band managed MACs

On Sat, Jan 04, 2025 at 22:09, "Russell King (Oracle)" <linux@...linux.org.uk> wrote:
> On Sat, Jan 04, 2025 at 10:37:00PM +0100, Tobias Waldekranz wrote:
>> On Thu, Jan 02, 2025 at 17:08, "Russell King (Oracle)" <linux@...linux.org.uk> wrote:
>> > On Thu, Jan 02, 2025 at 02:06:32PM +0100, Tobias Waldekranz wrote:
>> >> On Thu, Jan 02, 2025 at 10:31, "Russell King (Oracle)" <linux@...linux.org.uk> wrote:
>> >> > On Thu, Dec 19, 2024 at 01:30:42PM +0100, Tobias Waldekranz wrote:
>> >> >> NOTE: This issue was addressed in the referenced commit, but a
>> >> >> conservative approach was chosen, where only 6095, 6097 and 6185 got
>> >> >> the fix.
>> >> >> 
>> >> >> Before the referenced commit, in the following setup, when the PHY
>> >> >> detected loss of link on the MDI, mv88e6xxx would force the MAC
>> >> >> down. If the MDI-side link was then re-established later on, there was
>> >> >> no longer any MII link over which the PHY could communicate that
>> >> >> information back to the MAC.
>> >> >> 
>> >> >>         .-SGMII/USXGMII
>> >> >>         |
>> >> >> .-----. v .-----.   .--------------.
>> >> >> | MAC +---+ PHY +---+ MDI (Cu/SFP) |
>> >> >> '-----'   '-----'   '--------------'
>> >> >> 
>> >> >> Since this is a generic problem on all MACs connected to a SERDES - which
>> >> >> is the only time when in-band-status is used - move all chips to a
>> >> >> common mv88e6xxx_port_sync_link() implementation which avoids forcing
>> >> >> links on _all_ in-band managed ports.
>> >> >> 
>> >> >> Fixes: 4efe76629036 ("net: dsa: mv88e6xxx: Don't force link when using in-band-status")
>> >> >> Signed-off-by: Tobias Waldekranz <tobias@...dekranz.com>
>> >> >
>> >> > I'm feeling uneasy about this change.
>> >> >
>> >> > The history of the patch you refer to is - original v1:
>> >> >
>> >> > https://lore.kernel.org/r/20201013021858.20530-2-chris.packham@alliedtelesis.co.nz
>> >> >
>> >> > When v3 was submitted, it was unchanged:
>> >> >
>> >> > https://lore.kernel.org/r/20201020034558.19438-2-chris.packham@alliedtelesis.co.nz
>> >> >
>> >> > Both of these applied the in-band-status thing to all Marvell DSA
>> >> > switches, but as Marek states here:
>> >> >
>> >> > https://lore.kernel.org/r/20201020165115.3ecfd601@nic.cz
>> >> 
>> >> Thanks for that context!
>> >> 
>> >> > doing so breaks at least one Marvell DSA switch (88E6390). Hence why
>> >> > this approach is taken, rather than not forcing the link status on all
>> >> > DSA switches.
>> >> >
>> >> > Your patch appears to be reverting us back to what was effectively in
>> >> > Chris' v1 patch from back then, so I don't think we can accept this
>> >> > change. Sorry.
>> >> 
>> >> Before I abandon this broader fix, maybe you can help me understand
>> >> something:
>> >> 
>> >> If a user explicitly selects `managed = "in-band-status"`, why would we
>> >> ever interpret that as "let's force the MAC's settings according to what
>> >> the PHY says"? Is that not what `managed = "auto"` is for?
>> >
>> > You seem confused with that point, somehow confusing the calls to
>> > mac_link_up()/mac_link_down() when using in-band-status with something
>> > that a PHY would indicate. No, that's just wrong.
>> >
>> > If using in-band-status, these calls will be made in response to what
>> > the PCS says the link state is, possibly in conjunction with a PHY if
>> > there is a PHY present. Whether the PCS state gets forwarded to the MAC
>> > is hardware specific, and we have at least one DSA switch where this
>> > doesn't appear to happen.
>> >
>> > Please realise that there are _three_ distinct modules here:
>> >
>> > - The MAC
>> > - The PCS
>> > - The PHY or media
>> 
>> Right, I sloppily used "PHY" to refer to the link partner on the other
>> end of the SERDES.  I realize that the remote PCS does not have to
>> reside within a PHY.
>
> Sigh, it seems I'm not making myself clear.
>
> Host system:
>
>   ---------------------------+
>     NIC (or DSA switch port) |
>      +-------+    +-------+  |
>      |       |    |       |  |
>      |  MAC  <---->  PCS  <-----------------------> PHY, SFP or media
>      |       |    |       |  |     ^
>      +-------+    +-------+  |     |
>                              |   phy interface type
>   ---------------------------+   also in-band signalling
>                                  which managed = "in-band-status"
>                                  applies to

This part is 100% clear.

>> E.g. what does it mean to have an SGMII link where in-band signaling is
>> not used?  Is that not part of what defines SGMII?
>
> There _are_ PHYs out there that implement Cisco SGMII (which is IEEE
> 802.3 1000BASE-X modified to allow signalling at 10M and 100M speeds by
> symbol replication, and changing the format of the 1000BASE-X to provide
> the details of the SGMII link speed and duplex) but do _not_ support
> that in-band signalling.
>
> The point of SGMII without in-band signalling rather than just using
> 1000BASE-X without in-band signalling is that SGMII can operate at
> 10M and 100M, whereas 1000BASE-X can not.
>
> The usual situation, however, is that most devices that support Cisco
> SGMII also allow the in-band signalling to be configured to be used or
> not used.

Yes, I know about the relationship between 1000BASE-X and SGMII, I just
did not know that there were devices that only implemented the symbol
replication part.

> Going back to the diagram above, the link between the MAC and PCS is
> _not_ described in DT currently, neither by the managed property nor
> by the phy-mode etc. properties.

Clear.

> Now, the port configuration register on the Marvell switches controls
> the MAC settings. The PCS has a separate register set (normally
> referred to as serdes in Marvell's Switch terminology) which is an
> IEEE compliant clause 22 register layout.

...or Clause 45 on the Amethyst - sure.  I am quite familiar with these
devices.  That is not the source of my confusion.

> The problem is, it seems *some* Marvell switches automatically forward
> the PCS status to the MAC. Other switches do not. The DT "managed"

Yes, I understand this problem, and that that is the reason for
rejecting the patch.

> property does not describe this - because - as stated above - the
> "managed" property applies to the link between the PCS and external
> world (which may be a PHY, or may be media) and _not_ between the
> MAC and its associated PCS.

I understand _where_ it applies and _what_ it describes, but I do not
understand _how_ a driver should make use of it.

In other words, my question is:

For a NIC driver to properly support the `managed` property, how should
the setup and/or runtime behavior of the hardware and/or the driver
differ with the two following configs?

    &eth0 {
        phy-connection-type = "sgmii";
        managed = "auto";
    };

vs.

    &eth0 {
        phy-connection-type = "sgmii";
        managed = "in-band-status";
    };
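
To pin down the part that I do understand, i.e. the reason the patch
was rejected, here is a rough model. Every name in it is made up for
illustration and none of it is mv88e6xxx or phylink code. It assumes
that whether the PCS status reaches the MAC is a fixed per-chip
property, and that ports not using in-band-status always need the MAC
forced by software.

```c
#include <stdbool.h>

/* Hypothetical model, not driver code. */
enum link_managed {
	MANAGED_AUTO,
	MANAGED_IN_BAND_STATUS,
};

struct chip_model {
	/* Assumption: a fixed hardware property of each switch family,
	 * which the DT "managed" property does not describe.
	 */
	bool pcs_forwards_to_mac;
};

/* True when software would have to force the MAC link state. */
static bool must_force_mac_link(const struct chip_model *chip,
				enum link_managed managed)
{
	/* Without in-band status, software is the only source of link
	 * state for the MAC (assumption of this model).
	 */
	if (managed != MANAGED_IN_BAND_STATUS)
		return true;

	/* With in-band status, the PCS knows the link state, but the
	 * MAC only sees it if the hardware forwards it.
	 */
	return !chip->pcs_forwards_to_mac;
}
```

In this model the patch amounts to assuming pcs_forwards_to_mac is true
for every chip, which is the assumption reported not to hold for at
least the 88E6390.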
