netdev - Re: lan78xx: /sys/class/net/eth0/carrier stuck at 1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20201106095611.7e5912a0@gollum>
Date:   Fri, 6 Nov 2020 09:56:11 +0100
From:   Juerg Haefliger <juerg.haefliger@...onical.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Juerg Haefliger <juerg.haefliger@...onical.com>,
        netdev@...r.kernel.org, woojung.huh@...rochip.com
Subject: Re: lan78xx: /sys/class/net/eth0/carrier stuck at 1

On Tue, 3 Nov 2020 16:27:23 +0100
Andrew Lunn <andrew@...n.ch> wrote:

> On Tue, Nov 03, 2020 at 01:47:12PM +0100, Juerg Haefliger wrote:
> > On Fri, 23 Oct 2020 15:05:19 +0200
> > Andrew Lunn <andrew@...n.ch> wrote:
> >   
> > > On Fri, Oct 23, 2020 at 08:29:59AM +0200, Juerg Haefliger wrote:  
> > > > On Wed, 21 Oct 2020 21:35:48 +0200
> > > > Andrew Lunn <andrew@...n.ch> wrote:
> > > >     
> > > > > On Wed, Oct 21, 2020 at 05:00:53PM +0200, Juerg Haefliger wrote:    
> > > > > > Hi,
> > > > > > 
> > > > > > If the lan78xx driver is compiled into the kernel and the network cable is
> > > > > > plugged in at boot, /sys/class/net/eth0/carrier is stuck at 1 and doesn't
> > > > > > toggle if the cable is unplugged and replugged.
> > > > > > 
> > > > > > If the network cable is *not* plugged in at boot, all seems to work fine.
> > > > > > I.e., post-boot cable plugs and unplugs toggle the carrier flag.
> > > > > > 
> > > > > > Also, everything seems to work fine if the driver is compiled as a module.
> > > > > > 
> > > > > > There's an older ticket for the raspi kernel [1] but I've just tested this
> > > > > > with a 5.8 kernel on a Pi 3B+ and still see that behavior.      
> > > > > 
> > > > > Hi Jürg    
> > > > 
> > > > Hi Andrew,
> > > > 
> > > >     
> > > > > Could you check if a different PHY driver is being used when it is
> > > > > built and broken vs module or built in and working.
> > > > > 
> > > > > Look at /sys/class/net/eth0/phydev/driver    
> > > > 
> > > > There's no such file.    
> > > 
> > > I _think_ that means it is using genphy, the generic PHY driver, not a
> > > specific vendor PHY driver? What does
> > > 
> > > /sys/class/net/eth0/phydev/phy_id contain.  
> > 
> > There is no directory /sys/class/net/eth0/phydev.  
> 
> [Goes and looks at the code]
> 
> The symbolic link is only created if the PHY is connected to the MAC
> if the MAC has been registered with the core first. lan78xx does it
> the other way around:
> 
>         ret = lan78xx_phy_init(dev);
>         if (ret < 0)
>                 goto out4;
> 
>         ret = register_netdev(netdev);
>         if (ret != 0) {
>                 netif_err(dev, probe, netdev, "couldn't register the device\n");
>                 goto out5;
>         }
> 
> The register dump you show below indicates an ID of 007c132, which
> fits the drivers drivers/net/phy/microchip.c : "Microchip
> LAN88xx". Any mention of that in dmesg, do you see the module loaded?

It's built-in but is being used (as tracing shows).

 
> >   
> > > > Given that all works fine as long as the cable is unplugged at boot points
> > > > more towards a race at boot or incorrect initialization sequence or something.    
> > > 
> > > Could be. Could you run
> > > 
> > > mii-tool -vv eth0  
> > 
> > Hrm. Running that command unlocks the carrier flag and it starts toggling on
> > cable unplug/plug. First invocation:
> > 
> > $ sudo mii-tool -vv eth0
> > Using SIOCGMIIPHY=0x8947
> > eth0: negotiated 1000baseT-FD flow-control, link ok
> >   registers for MII PHY 1: 
> >     1040 79ed 0007 c132 05e1 cde1 000f 0000
> >     0000 0200 0800 0000 0000 0000 0000 3000
> >     0000 0000 0088 0000 0000 0000 3200 0004
> >     0040 a000 a000 0000 a035 0000 0000 0000
> >   product info: vendor 00:01:f0, model 19 rev 2
> >   basic mode:   autonegotiation enabled
> >   basic status: autonegotiation complete, link ok
> >   capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
> >   advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
> >   link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
> > 
> > Subsequent invocation:
> > 
> > $ sudo mii-tool -vv eth0
> > Using SIOCGMIIPHY=0x8947
> > eth0: negotiated 1000baseT-FD flow-control, link ok
> >   registers for MII PHY 1: 
> >     1040 79ed 0007 c132 05e1 cde1 000d 0000
> >     0000 0200 0800 0000 0000 0000 0000 3000
> >     0000 0000 0088 0000 0000 0000 3200 0004
> >     0040 a000 0000 0000 a035 0000 0000 0000
> >   product info: vendor 00:01:f0, model 19 rev 2
> >   basic mode:   autonegotiation enabled
> >   basic status: autonegotiation complete, link ok
> >   capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
> >   advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
> >   link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
> > 
> > In the first invocation, register 0x1a shows a pending link-change interrupt
> > (0xa000) which wasn't serviced (and cleared) for some reason. Dumping the
> > registers cleared that interrupt bit and things start working correctly
> > afterwards. Nor sure yet why that first interrupt is ignored.  
> 
> So, 0x1a is the interrupt status, and 0x19 is the interrupt mask.

Yes.


> This should really be interpreted as a level interrupt. But it appears
> the hardware the interrupt is connected to is actually doing edge. And
> the edge has been missed, and so the interrupt is never serviced.
> 
> I think the call sequence goes something like this, if i'm reading the
> code correct:
> 
> lan78xx_probe() calls lan78xx_bind:
> 
>     lan78xx_bind() registers an interrupt domain. This allows USB
>     status messages indicating an interrupt to be dispatched using the
>     normal interrupt mechanism, despite there not being a proper
>     interrupt as such. The masking of interrupts seems to be part of
>     INT_EP_CTL. This register is read during
>     lan78xx_setup_irq_domain(), but it is not written to disable all
>     interrupts. So it could be, PHY interrupts in the interrupt
>     controller are already enabled at this point.
> 
>     lan78xx_bind() also registers an MDIO bus. This will cause the bus
>     to be probed and the PHY should be found. 
> 
> lan78xx_probe() calls lan78xx_phy_init:
> 
>     lan78xx_phy_init gets the phydev, and fills in the interrupt
>     number. It then connects the PHY to the MAC using
>     phy_connect_direct(). phy_connect_direct() will request the
>     interrupt from the kernel, meaning it can then service interrupts,
>     and as part of that, it calls into the interrupt domain and
>     enables interrupts in the interrupt controller.  It clears the
>     interrupts in the PHY, which means the PHY interrupt status
>     register is read, clearing it. It then enables interrupts in the
>     PHY by again reading the status register to clear any pending
>     interrupts and then sets the mask to enable interrupts.
>     
> So at this point, the PHY is ready to generate interrupts, the
> interrupt controller in the USB device is ready to accept them. The
> kernel itself is read for them, and they should be passed to the PHY
> subsystem. What is not clear to me is if the USB endpoint is correctly
> setup to report interrupts.
> 
>     Interestingly, the last thing lan78xx_phy_init() does is call
>     genphy_config_aneg(). That configures the PHY to start an
>     auto-neg. That is wrong, on a number of levels. The PHY drivers
>     implementation, lan88xx_config_aneg() sets the mdix before
>     starting autoneg. That gets skipped by directly calling
>     genphy_config_aneg(). However, the interface is not even
>     registered yet, let alone up. It has no business starting an
>     auto-neg. But if the cable is connected and the peer is ready,
>     about 1.5 seconds later, we expect auto-neg to complete and the
>     interrupt to fire.
> 
> Sometime later, lan78xx_open() is called when the interface is
> configured up. It calls phy_start() which is the correct way to get
> the PHY going. That will call into the PHY driver to start auto-neg.
> I've no idea what happens in this PHY when two auto-negs are going.
> 
> What lan78xx_open() also does is trigger a delayed work for
> EVENT_LINK_RESET. lan78xx_link_reset() gets called as a result, and
> that writes to the interrupt status register to clear it. Why clear
> interrupts when we have just started auto-neg and we expect it to
> cause an interrupt? The whole of lan78xx_link_reset() looks odd, and
> some of it should be moved to lan78xx_link_status_change() which
> phylib will call when the PHY changes status.
> 
> O.K. As a start, remove the  in  from
> lan78xx_phy_init(). I don't know if that is enough, but it is clearly
> wrong.

The interface won't come up with that change. FWIW I recorded some traces
(functions phy_*, lan88xx_*, lan78xx_*):
[1] lan78xx built-in.
[2] lan78xx built-in without genphy_config_aneg() in lan78xx_phy_init().
[3] lan78xx as a module.

I've also added some trace_printks that dump the content of the status and
mask registers at the end of lan78xx_probe() (no interrupt pending) and in 
lan78xx_get_drvinfo() which seems to be the next lan78xx function being
called after probing (per the built-in trace). At that time an interrupt
seems to be pending (lan78xx_get_drvinfo LAN88XX_INT_STS = 0xa000).

...Juerg

[1] https://kernel.ubuntu.com/~juergh/lp1890487/trace.builtin
[2] https://kernel.ubuntu.com/~juergh/lp1890487/trace.builtin-no-aneg
[3] https://kernel.ubuntu.com/~juergh/lp1890487/trace.module

 
>     Andrew


Content of type "application/pgp-signature" skipped