netdev - Re: phylink / mv8e6xxx and SGMII aneg not working, link doesn't come up

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190227225221.vubwqa6vg5odq26w@shell.armlinux.org.uk>
Date:   Wed, 27 Feb 2019 22:52:21 +0000
From:   Russell King - ARM Linux admin <linux@...linux.org.uk>
To:     Heiner Kallweit <hkallweit1@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: phylink / mv8e6xxx and SGMII aneg not working, link doesn't come
 up

On Wed, Feb 27, 2019 at 10:25:54PM +0100, Heiner Kallweit wrote:
> For the ones who don't know my story yet:
> I have a Marvell 88E6390 switch with a Clause 45 PHY connected via
> SGMII to port 9. For other reasons I limit the PHY to 100Mbps currently.
> DT says: managed = "in-band-status"
> Driver is mv8e6xxx + phylink.
> Problem is that the link doesn't come up. Debugging resulted in:
> 
> PHY establishes link to link partner (100Mbps, FD) and fires the
> "link up" interrupt. The "link up" is also properly transmitted via
> SGMII in-band signalling and fires the SERDES interrupt in mv8e6xxx.
> Problem is that the in-band transmitted values for speed and duplex
> don't show up anywhere. And phylink_resolve() doesn't even print
> the link-up message.
> 
> First question: Both, PHY and SERDES interrupt, try to do the same
> thing: they eventually call phylink_run_resolve(). Is this correct?

Correct - when the PHY interrupts, it triggers phylib to read the
status and issue a callback via the link status hook into phylink,
which stores the information from the PHY.  That triggers a resolve.
Even though the PHY reports that it has established link, the link
may not yet be up.

The Serdes is the MAC side of the link, and that should be calling
phylink_mac_change().  That will re-run the resolve.

Each time the resolve is triggered in "in-band" mode, we expect to
read the link status from the MAC side of the link, along with the
speed and duplex.  Since SGMII does not carry the flow control
parameters, if a PHY is present, we merge the PHYs flow control 
information and pass that back to the MAC to configure the MAC's
flow control to match.

> I have some problems with understanding the code for MLO_AN_INBAND
> in phylink_resolve(). Maybe also something is missing there for
> proper in-band aneg support.

Nope, it works with mvneta and mvpp2.

> First we get the mac state (which is link down, 10Mbps, HD in my case).
> Then the link state is calculated as "mac link up" && "phy link up".
> This results in "link down", because mac link is down and phy link is
> up. This logic isn't clear to me. How is the "link up" info supposed
> to ever reach the mac?

Via the in-band status, and the MAC detecting that the link is now up.

> We're reading the port status, but IMO we want to set it based on the
> auto-negotiated settings we get from the PHY.

No we don't - in SGMII in-band mode, the PHY is optional, and even if
it is present, if the MAC reports that the link is down even if the PHY
reports that the link is up, then the link is _not_ up.

If you want to use SGMII without in-band, then phylink supports that
(which is PHY mode).  If the MAC is unable to report these parameters,
then in-band mode is not appropriate.

> Later pause settings and possibly changed interface mode are
> propagated to the mac. But no word about speed and duplex.
> 
> Via "some callback" "some code" should read the in-band-transmitted
> speed and duplex values from the SGMII port and use them to configure
> the mac. Is this simply missing or do I miss something?

>From what you've described, it sounds like what you actually have is:

	MAC <---> Serdes PHY <---> PHY

The Serdes PHY receives the SGMII in-band negotiation from the external
PHY, but there is no propagation of the status from the serdes PHY to
the MAC.  What's more is the Serdes PHY is hidden in mv88e6xxx setups.

This is a classic stacked PHY setup, one which we're now starting to
encounter, and we need to support it - we have three cases I'm aware of
now where we have:

	MAC <---> PHY <---> SFP

and "SFP" can be another PHY, so we end up with:

	MAC <---> PHY <---> PHY

and phylib does not support having two PHYs on the same net_device.

What's more is that the need for an "attached_dev" net_device is pretty
heavily embedded into phylib, so attaching one PHY to another PHY
doesn't work very well.

I've been tinkering with some patches to abstract out the "attached_dev"
stuff, but in doing so I've walked into a whole load of issues in
phy_attach_direct() and phy_detach() - it looks like phy_attach_direct()
is lacking any form of locking while binding one of the generic drivers
(it needs to hold the device lock while doing so.)  The comments around
device_release_driver() are vague about whether the parent device needs
to be locked.  Questions have been asked about that but so far I haven't
had a response.

We can replace a load of phydev->attached_dev != NULL tests (which are
checking whether the device is attached to a netdev) fairly easily,
but there's a load of places which do (phydev->attached_dev &&
phydev->adjust_link) - and I fear that phylink (which doesn't set
phydev->adjust_link) results in this code being broken.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up