lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210901013830.yst73ubhsrlml54i@skbuf>
Date:   Wed, 1 Sep 2021 04:38:30 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Saravana Kannan <saravanak@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Linus Walleij <linus.walleij@...aro.org>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Len Brown <lenb@...nel.org>,
        Alvin Sipraga <ALSI@...g-olufsen.dk>, kernel-team@...roid.com,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        linux-acpi@...r.kernel.org
Subject: Re: [PATCH v1 1/2] driver core: fw_devlink: Add support for
 FWNODE_FLAG_BROKEN_PARENT

On Wed, Sep 01, 2021 at 04:28:26AM +0300, Vladimir Oltean wrote:
> On Wed, Sep 01, 2021 at 02:18:04AM +0300, Vladimir Oltean wrote:
> > On Wed, Sep 01, 2021 at 01:02:09AM +0200, Andrew Lunn wrote:
> > > Rev B is interesting because switch0 and switch1 got genphy, while
> > > switch2 got the correct Marvell PHY driver. switch2 PHYs don't have
> > > interrupt properties, so don't loop back to their parent device.
> > 
> > This is interesting and not what I really expected to happen. It goes to
> > show that we really need more time to understand all the subtleties of
> > device dependencies before jumping on patching stuff.
> 
> There is an even more interesting variation which I would like to point
> out. It seems like a very odd loophole in the device links.
> 
> Take the example of the mv88e6xxx DSA driver. On my board
> (arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts), even after I
> had to declare the switches as interrupt controller and add interrupts
> to their internal PHYs, I still need considerable force to 'break' this
> board in the way discussed in this thread. The correct PHY driver insists
> to probe, and not genphy. Let me explain.
> 
> The automatic device links between the switch (supplier, as interrupt-controller)
> and PHYs (consumers) are added by fwnode_link_add, called from of_link_to_phandle.
> 
> Important note: fwnode_link_add does not link devices, it links OF nodes.
> 
> Even more important node, in the form of a comment:
> 
>  * The driver core will use the fwnode link to create a device link between the
>  * two device objects corresponding to @con and @sup when they are created. The
>  * driver core will automatically delete the fwnode link between @con and @sup
>  * after doing that.
> 
> Okay?!
> 
> What seems to be omitted is that the DSA switch driver's probing itself
> can be deferred. For example:
> 
> dsa_register_switch
> -> dsa_switch_probe
>    -> dsa_switch_parse_of
>       -> dsa_switch_parse_ports_of
>          -> dsa_port_parse_of
>             -> of_find_net_device_by_node(of_parse_phandle(dn, "ethernet", 0));
>             -> not found => return -EPROBE_DEFER
> 
> When dsa_register_switch() returns -EPROBE_DEFER, it is effectively
> an error path. So the reverse of initialization is performed.
> 
> The mv88e6xxx driver calls mv88e6xxx_mdios_register() right _before_
> dsa_register_switch. So when dsa_register_switch returns error code,
> mv88e6xxx_mdios_unregister() will be called.
> 
> When mv88e6xxx_mdios_unregister() is called, the MDIO buses with
> internal PHYs are destroyed. So the PHY devices themselves are destroyed
> too. And the device links between the DSA switch and the internal PHYs,
> those created based on the firmware node links created by fwnode_link_add,
> are dropped too.
> 
> Now remember the comment that the device links created based on
> fwnode_link_add are not restored.
> 
> So probing of the DSA switch finally resumes, and this time
> device_links_check_suppliers() is effectively bypassed, the PHYs no
> longer request probe deferral due to their supplier not being ready,
> because the device link no longer exists.
> 
> Isn't this self-sabotaging?!
> 
> Now generally, DSA drivers defer probing because they probe in parallel
> with the DSA master. This is typical if the switch is on a SPI bus, or
> I2C, or on an MDIO bus provided by a _standalone_ MDIO controller.
> 
> If the MDIO controller is not standalone, but is provided by Ethernet
> controller that is the DSA master itself, then things change a lot,
> because probing can never be parallel. The DSA master probes,
> initializes its MDIO bus, and this triggers the probing of the MDIO
> devices on that bus, one of which is the DSA switch. So DSA can no
> longer defer the probe due to that reason.
> 
> Secondly, in DSA we even have variation between drivers as to where they
> register their internal MDIO buses. The mv88e6xxx driver does this in
> mv88e6xxx_probe (the probe function on the MDIO bus). The rtl8366rb
> driver calls realtek_smi_setup_mdio() from rtl8366rb_setup(), and this
> is important. DSA provides drivers with a .setup() callback, which is
> guaranteed to take place after nothing can defer the switch's probe
> anymore.
> 
> So putting two and two together, sure enough, if I move mv88e6xxx_mdios_register
> from mv88e6xxx_probe to mv88e6xxx_setup, then I can reliably break this
> setup, because the device links framework isn't sabotaging itself anymore.
> 
> Conversely, I am pretty sure that if rtl8366rb was to call of_mdiobus_register()
> from the probe method and not the setup method, the entire design issue
> with interrupts on internal DSA switch ports would have went absolutely
> unnoticed for a few more years.
> 
> I have not tested this, but it also seems plausible that DSA can
> trivially and reliably bypass any fw_devlink=on restrictions by simply
> moving all of_mdiobus_register() driver calls from the .setup() method
> to their respective probe methods (prior to calling dsa_register_switch),
> then effectively fabricate an -EPROBE_DEFER during the first probe attempt.
> I mean, who will know whether that probe deferral request was justified
> or not?

Pushing the thought even further, it is not even necessary to move the
of_mdiobus_register() call to the probe function. Where it is (in .setup)
is already good enough. It is sufficient to return -EOPNOTSUPP once
(the first time) immediately _after_ the call to of_mdiobus_register
(and have a proper error path, i.e. call mdiobus_unregister too).

> Anyway, I'm not sure everyone agrees with this type of "solution" (even
> though it's worth pointing it out as a fw_devlink limitation). In any
> case, we need some sort of lightweight "fix" to the chicken-and-egg
> problem, which will give me enough time to think of something better.
> I hope it is at least clearer now that there are subtleties and nuances,
> and we cannot just assess how many boards are broken by looking at the
> device trees. By design, all are, sure, but they might still work, and
> that's better than nothing...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ