lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ykdqxnky7shebbhtucoiokbews2be5bml6raqafsfn4x6bp6h3@nqsn6akpajvp>
Date: Thu, 2 May 2024 12:43:27 -0500
From: Andrew Halaney <ahalaney@...hat.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-stm32@...md-mailman.stormreply.com, netdev@...r.kernel.org, alexandre.torgue@...s.st.com, 
	joabreu@...opsys.com, davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	pabeni@...hat.com, mcoquelin.stm32@...il.com, andrew@...n.ch, hkallweit1@...il.com
Subject: Re: racing ndo_open()/phylink*connect() with phy_probe()

On Tue, Apr 30, 2024 at 10:42:58PM +0100, Russell King (Oracle) wrote:
> On Tue, Apr 30, 2024 at 04:02:19PM -0500, Andrew Halaney wrote:
> > Basically, NetworkManager is setting both interfaces to up, and end1's
> > phy doesn't seem to be ready when ndo_open() runs, returning
> > -ENODEV in phylink_fwnode_phy_connect() and bubbling that back up. This doesn't
> 
> Let's get something clear - you're attributing phylink to this, but this
> is not the case. phylink doesn't deal directly with PHYs, it makes use
> of phylib for that, and merely passes back to its caller whatever status
> it gets from phylib. It's also not fair to attribute this to phylib as
> we will see later...

Sorry for the delay, I wanted to try and test with some extra logs in
the legit setup (not my "simulate via EPROBE_DEFER delays" approach)
which is tedious with the initramfs (plus I wasted time failing to
ftrace some stuff :P) to reconvince me of old notes. Thanks for the
explanation above on the nuances between phylink and phylib, I really
appreciate it.

> 
> There are a few reasons for phylink_fwnode_phy_connect() would return
> -ENODEV:
> 
> 1) fwnode_get_phy_node() (a phylib function) returning an error,
> basically meaning the phy node isn't found. This would be a persistent
> error, so unlikely to be your issue.
> 
> 2) fwnode_phy_find_device() (another phylib function) not finding the
> PHY device corresponding to the fwnode returned by the above on the
> MDIO bus. This is possible if the PHY has not been detected on the
> MDIO bus, but I suspect this is not the cause of your issue.

So I think we're in this case. I added some extra logs to see which
of the cases we were hitting, as well as some extra logs in phy creation
code etc to come to that conclusion:

    # end1 probe start (and finish)
    [    1.424099] qcom-ethqos 23000000.ethernet: Adding to iommu group 2
    ...
    [    1.431267] qcom-ethqos 23000000.ethernet: Using 40/40 bits DMA host/device width

    # end0 probe start
    [    1.440517] qcom-ethqos 23040000.ethernet: Adding to iommu group 3
    ...
    [    1.443502] qcom-ethqos 23040000.ethernet: Using 40/40 bits DMA host/device width

    # end0 starts making the mdio bus, and phy devices
    [    1.443537] qcom-ethqos 23040000.ethernet: Before of_mdiobus_reg

    # create phy at addr 0x8, end0's phy
    [    1.450118] Starting phy_create_device for addr: 8

    # NetworkManager up'ed end1! and again. But the device we're needing
    # (0xa) isn't created yet
    [    1.459743] qcom-ethqos 23000000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-0
    ...
    [    1.465168] Failed at fwnode_phy_find_device
    [    1.465174] qcom-ethqos 23000000.ethernet end1: __stmmac_open: Cannot attach to PHY (error: -19)
    [    1.473687] qcom-ethqos 23000000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-0
    ...
    [    1.477637] Failed at fwnode_phy_find_device
    [    1.477643] qcom-ethqos 23000000.ethernet end1: __stmmac_open: Cannot attach to PHY (error: -19)

    # device created for 0x8, probe it
    [    1.531617] Ending phy_create_device for addr: 8
    [    1.627462] Marvell 88E1510 stmmac-0:08: Starting probe
    [    1.627644] hwmon hwmon0: temp1_input not attached to any thermal zone
    [    1.627650] Marvell 88E1510 stmmac-0:08: Ending probe

    # device created for 0xa, probe it
    [    1.628992] Starting phy_create_device for addr: a
    [    1.632615] Ending phy_create_device for addr: a
    [    1.731552] Marvell 88E1510 stmmac-0:0a: Starting probe
    [    1.731732] hwmon hwmon1: temp1_input not attached to any thermal zone
    [    1.731738] Marvell 88E1510 stmmac-0:0a: Ending probe

    # end0 is done probing now
    [    1.732804] qcom-ethqos 23040000.ethernet: After of_mdiobus_reg
    [    1.820725] qcom-ethqos 23040000.ethernet end0: renamed from eth0

    # NetworkManager up's end0
    [    1.851805] qcom-ethqos 23040000.ethernet end0: Register MEM_TYPE_PAGE_POOL RxQ-0
    ...
    [    1.914980] qcom-ethqos 23040000.ethernet end0: PHY [stmmac-0:08] driver [Marvell 88E1510] (irq=233)
    ...
    [    1.939432] qcom-ethqos 23040000.ethernet end0: configuring for phy/sgmii link mode
    ...
    [    4.451765] qcom-ethqos 23040000.ethernet end0: Link is Up - 1Gbps/Full - flow control rx/tx

So end1 is up'ed before end0 can finish making its mdio bus / phy
devices, and therefore we fail to find it. I can easily simulate this
situation as well by -EPROBE_DEFER'ing end0 for say 10 seconds.


In playing around with this I also discovered that if end1's marvell
phy -EPROBE_DEFERs for a bit, up'ing end1 results in matching against the Generic PHY
driver, so then things don't work network wise. That's a similar topic, but
probably should be discussed separately?
Mentioning it now before I forget though. Here's some logs:

    # Probe end1
    [    8.245164] qcom-ethqos 23000000.ethernet: Adding to iommu group 8
    ...
    [    8.377010] qcom-ethqos 23000000.ethernet: Using 40/40 bits DMA host/device width

    # Probe end0
    [    8.396919] qcom-ethqos 23040000.ethernet: Adding to iommu group 9
    ...
    [    8.513481] qcom-ethqos 23040000.ethernet: Using 40/40 bits DMA host/device width
    [    8.521475] qcom-ethqos 23040000.ethernet: Before of_mdiobus_reg
    [    8.529283] Starting phy_create_device for addr: 8
    [    8.553872] Ending phy_create_device for addr: 8
    [    8.714637] Marvell 88E1510 stmmac-0:08: Ending probe
    [    8.721627] Starting phy_create_device for addr: a
    [    8.729064] Ending phy_create_device for addr: a
    [    8.898759] qcom-ethqos 23040000.ethernet: After of_mdiobus_reg
    ...

    # NetworkManager ups end0
    [    9.028419] qcom-ethqos 23040000.ethernet end0: Register MEM_TYPE_PAGE_POOL RxQ-0
    ...
    [    9.092839] net end0: Before phylink_fwnode_phy_connect
    [    9.164375] qcom-ethqos 23040000.ethernet end0: PHY [stmmac-0:08] driver [Marvell 88E1510] (irq=280)
    [    9.174201] net end0: After phylink_fwnode_phy_connect
    ...

    # NetworkManager ups end1, get the Generic PHY instead of marvell...
    [    9.257364] qcom-ethqos 23040000.ethernet end0: configuring for phy/sgmii link mode
    ...
    [    9.317542] net end1: Before phylink_fwnode_phy_connect
    [    9.404384] qcom-ethqos 23000000.ethernet end1: PHY [stmmac-0:0a] driver [Generic PHY] (irq=POLL)
    [    9.414730] net end1: After phylink_fwnode_phy_connect
    ...
    [    9.509450] qcom-ethqos 23000000.ethernet end1: configuring for phy/sgmii link mode

    # end0 comes up, end1 doesn't due to the wrong phy being selected here
    [   11.672223] qcom-ethqos 23040000.ethernet end0: Link is Up - 1Gbps/Full - flow control rx/tx

> 
> 3) phy_attach_direct() (another phylib function) returning an error.
> This function calls phy_init_hw() which will attempt to talk to the
> hardware, and if that returns an error, it will be propagated up.
> 
> (3) is the most likely scenario given your quoted DT description. I
> suspect that the stmmac/qcom-ethqos driver is what's at fault here.
> 
> Your DT description shows that the PHYs are on one MDIO bus owned by
> one of the network interfaces. I suspect if that network interface
> is down, then the MDIO bus is not accessible.
> 
> Therefore, when you attempt to open the _other_ network interface,
> accesses to its PHY fail with -ENODEV and that gets propagated all
> the way back up.
> 
> What's more is if you did manage to get that network interface up
> (because the one with the MDIO bus on was up) then if you take
> that interface down, you'll end up with a phy_error() splat from
> phylib because the PHY it was using has become inaccessible.
> 
> Basically, the network driver is buggy for this PHY setup. If a
> MDIO bus contains devices that are not owned by the network device
> owning that MDIO bus, then the MDIO bus _must_ be prepared to handle
> MDIO bus accesses _at_ _any_ _time_. This clearly is not the case
> here.

As far as I can tell, at least from the "link up/down" perspective, any
combo works. If I boot (without NetworkManager doing things), I can
play around with any combo of link up and down without any noticeable
issue.

> 
> It could also be the case that if the driver is using runtime PM,
> that when the network interface is runtime-PM suspended, it causes
> MDIO bus accesses to fail... that would be very chaotic though.
> 
> In any case, I'm going to say... I don't think this is a phylink nor
> phylib issue, but a buggy network driver thinking that it has the
> right to shutdown MDIO bus access depending on its network interface
> state.
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ