lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 2 Sep 2021 15:35:32 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     "Russell King (Oracle)" <linux@...linux.org.uk>
Cc:     Vladimir Oltean <vladimir.oltean@....com>, netdev@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Andrew Lunn <andrew@...n.ch>,
        Heiner Kallweit <hkallweit1@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        linux-kernel@...r.kernel.org,
        Linus Walleij <linus.walleij@...aro.org>,
        Alvin Šipraga <alsi@...g-olufsen.dk>,
        ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
        kernel-team <kernel-team@...roid.com>,
        Len Brown <lenb@...nel.org>
Subject: Re: [RFC PATCH net-next 0/3] Make the PHY library stop being so
 greedy when binding the generic PHY driver

On Thu, Sep 02, 2021 at 01:19:27PM +0100, Russell King (Oracle) wrote:
> On Thu, Sep 02, 2021 at 01:50:50AM +0300, Vladimir Oltean wrote:
> > The central point of that discussion is that DSA seems "broken" for
> > expecting the PHY driver to probe immediately on PHYs belonging to the
> > internal MDIO buses of switches. A few suggestions were made about what
> > to do, but some were not satisfactory and some did not solve the problem.
> 
> I think you need to describe the mechanism here. Why wouldn't a PHY
> belonging to an internal MDIO bus of a switch not probe immediately?
> What resources may not be available?

As you point out below, the interrupt-controller is what is not available.
There is a mechanism called fw_devlink which infers links from one OF
node to another based on phandles. When you have an interrupt-parent,
that OF node becomes a supplier to you. Those OF node links are then
transferred to device links once the devices having those OF nodes are
created.

> If we have a DSA driver that tries to probe the PHYs before e.g. the
> interrupt controller inside the DSA switch has been configured, aren't
> we just making completely unnecessary problems for ourselves?

This is not what happens, if that were the case, of course I would fix
_that_ and not in this way.

> Wouldn't it be saner to ensure that the interrupt controller has been
> setup and become available prior to attempting to setup anything that
> relies upon that interrupt controller?

The interrupt controller _has_ been set up. The trouble is that the
interrupt controller has the same OF node as the switch itself, and the
same OF node. Therefore, fw_devlink waits for the _entire_ switch to
finish probing, it doesn't have insight into the fact that the
dependency is just on the interrupt controller.

> From what I see of Marvell switches, the internal PHYs only ever rely
> on internal resources of the switch they are embedded in.
> 
> External PHYs to the switch are a different matter - these can rely on
> external clocks, and in that scenario, it would make sense for a
> deferred probe to cause the entire switch to defer, since we don't
> have all the resources for the switch to be functional (and, because we
> want the PHYs to be present at switch probe time, not when we try to
> bring up the interface, I don't see there's much other choice.)
> 
> Trying to move that to interface-up time /will/ break userspace - for
> example, Debian's interfaces(8) bridge support will become unreliable,
> and probably a whole host of other userspace. It will cause regressions
> and instability to userspace. So that's a big no.

Why a big no? I expect there to be 2 call paths of phy_attach_direct:
- At probe time. Both the MAC driver and the PHY driver are probing.
  This is what has this patch addresses. There is no issue to return
  -EPROBE_DEFER at that time, since drivers connect to the PHY before
  they register their netdev. So if connecting defers, there is no
  netdev to unregister, and user space knows nothing of this.
- At .ndo_open time. This is where it maybe gets interesting, but not to
  user space. If you open a netdev and it connects to the PHY then, I
  wouldn't expect the PHY to be undergoing a probing process, all of
  that should have been settled by then, should it not? Where it might
  get interesting is with NFS root, and I admit I haven't tested that.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ