lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200717212605.GM1551@shell.armlinux.org.uk>
Date:   Fri, 17 Jul 2020 22:26:05 +0100
From:   Russell King - ARM Linux admin <linux@...linux.org.uk>
To:     Andrew Lunn <andrew@...n.ch>, Martin Rowe <martin.p.rowe@...il.com>
Cc:     netdev@...r.kernel.org, davem@...emloft.net,
        vivien.didelot@...il.com
Subject: Re: bug: net: dsa: mv88e6xxx: unable to tx or rx with Clearfog GT 8K
 (with git bisect)

On Fri, Jul 17, 2020 at 09:42:37PM +0200, Andrew Lunn wrote:
> On Fri, Jul 17, 2020 at 07:51:19PM +0100, Russell King - ARM Linux admin wrote:
> > On Fri, Jul 17, 2020 at 12:50:07PM +0000, Martin Rowe wrote:
> > > On Fri, 17 Jul 2020 at 09:22, Russell King - ARM Linux admin
> > > <linux@...linux.org.uk> wrote:
> > > > The key file is /sys/kernel/debug/mv88e6xxx.0/regs - please send the
> > > > contents of that file.
> > > 
> > > $ cat regs.broken
> > >     GLOBAL GLOBAL2 SERDES     0    1    2    3    4    5
> > >  0:  c800       0    ffff  9e07 9e4f 100f 100f 9e4f 170b
> > >  1:     0    803e    ffff     3    3    3    3    3 201f
> >                                                       ^^^^
> > This is where the problem is.
> > 
> > >  1:     0    803e    ffff     3    3    3    3    3 203f
> >                                                       ^^^^
> > 
> > In the broken case, the link is forced down, in the working case, the
> > link is forced up.
> > 
> > What seems to be happening is:
> > 
> > dsa_port_link_register_of() gets called, and we do this:
> > 
> >                 phy_np = of_parse_phandle(dp->dn, "phy-handle", 0);
> >                 if (of_phy_is_fixed_link(dp->dn) || phy_np) {
> >                         if (ds->ops->phylink_mac_link_down)
> >                                 ds->ops->phylink_mac_link_down(ds, port,
> >                                         MLO_AN_FIXED, PHY_INTERFACE_MODE_NA);
> >                         return dsa_port_phylink_register(dp);
> > 
> > which forces the link down, and for some reason the link never comes
> > back up.
> >
> > One of the issues here is of_phy_is_fixed_link() - it is dangerous.
> > The function name leads you astray - it suggests that if it returns
> > true, then you have a fixed link, but it also returns true of you
> > have managed!="auto" in DT, so it's actually fixed-or-inband-link.
> > 
> > Andrew, any thoughts?
> 
> 
> Hi Russell
> 
> I think that is my change, if i remember correctly. Something to do
> with phylink assuming all interfaces are down to begin with. But DSA
> and CPU links were defaulting to up. When phylink later finds the
> fixed-link it then configures the interface up again, and because the
> interface is up, nothing actually happens, or it ends up in the wrong
> mode. So i think my intention was, if there is a fixed link in DT,
> down the interface before registering it with phylink, so its
> assumptions are true, and it will later be correctly configured up.
> 
> So in this case, do you think we are falling into the trap of
> managed!="auto" ?

Yes, it looks that way to me.  The DT description for the port is:

                        port@5 {
                                reg = <5>;
                                label = "cpu";
                                ethernet = <&cp1_eth2>;
                                phy-mode = "2500base-x";
                                managed = "in-band-status";
                        };

So, of_phy_is_fixed_link() will return true, but as far as phylink is
concerned, it's in in-band status mode rather than fixed-link mode.
Hmm.

Digging out the serdes PHY status on my GT8k (C45 address 21, PHYXS):

 2000  1140 0145 0141 0c00 00a0 0000 0004 2001
 2008  0000 0000 0000 0000 0000 0000 0000 8000
...
 a000  2000 0600 0000 a420 5100 2000 0000 0000

Remembering that it's a C22 register layout for this PHY starting at
0x2000, with the Marvell status register at 0xa003.

BMCR = 0x1140 = BMCR_ANENABLE | BMCR_FULLDPLX | BMCR_SPEED1000
BMSR = 0x0145 = BMSR_ESTATEN | reserved_bit(6) | BMSR_LSTATUS | BMSR_ERCAP
Status = 0xa420 = MV88E6390_SGMII_PHY_STATUS_SPEED_1000 |
		  MV88E6390_SGMII_PHY_STATUS_DUPLEX_FULL |
		  MV88E6390_SGMII_PHY_STATUS_LINK |
		  bit(5)

Note that MV88E6390_SGMII_PHY_STATUS_SPD_DPL_VALID is missing, so the
results of the speed, duplex and pause are not valid.  The only reason
the link is up is because we're forcing it up.

The other end of the link is not allowing the BASE-X configuration to
complete, which is not that surprising when you notice it is:

&cp1_eth2 {
        status = "okay";
        phy-mode = "2500base-x";
        phys = <&cp1_comphy5 2>;
        fixed-link {
                speed = <2500>;
                full-duplex;
        };
};

in fixed-link mode rather than in-band mode.

So, each end of the link has been configured differently in DT.  One
end has been told to use in-band AN, whereas the other end has been
told not to, which means when we start interpreting in-band correctly
in DSA, this dis-similar setup breaks.

Both ends really need to agree, and I'd suggest cp1_eth2 needs to drop
the fixed-link stanza and instead use ``managed = "in-band";'' to be
in agreement with the configuration at the switch.

Martin, can you modify
arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts to test
that please?

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ