lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGETcx87mF=_90B_30tgTkMcFDPX6Lk2FRfA1d-G+x=Wuw2FLA@mail.gmail.com>
Date:   Mon, 23 Aug 2021 14:23:31 -0700
From:   Saravana Kannan <saravanak@...gle.com>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Alvin Šipraga <ALSI@...g-olufsen.dk>,
        Vladimir Oltean <olteanv@...il.com>,
        Vladimir Oltean <vladimir.oltean@....com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Frank Rowand <frowand.list@...il.com>,
        Rob Herring <robh+dt@...nel.org>
Subject: Re: [PATCH net] net: dsa: sja1105: fix use-after-free after calling
 of_find_compatible_node, or worse

On Mon, Aug 23, 2021 at 1:44 PM Andrew Lunn <andrew@...n.ch> wrote:
>
> > I thought about this in the background the past few days. I think
> > there are a couple of options:
> > 1. We (community/Andrew) agree that this driver would only work with
> > fw_devlink=on and we can confirm that the other upstream uses of
> > "realtek,rtl8366rb" won't have any unprobed consumers problem and
> > switch to using my patch. Benefit is that it's a trivial and quick
> > change that gets things working again.
>
> I don't think realtek,rtl8366rb is doing anything particularly
> unusual. It is not the only switch driver with an MDIO bus driver with
> its internal PHYs on it.

But realtek,rtl8366rb is doing some unusual things though:
1. Expecting its child devices would get probed as soon as they are
added in its own probe().
2. The child device in turn depends on the parent node to have probed
(the interrupt dependency in this case).

There is absolutely no guarantee from the driver core that (1) would
happen. In fact, it's more likely for this to not happen.
(2) isn't fully wrong, but it's kinda weird when combined with (1)
because it causes a quasi cyclic dependency.

>
> > 2. The "realtek,rtl8366rb" driver needs to be fixed to use a
> > "component device".
>
> Again, i don't think "realtek,rtl8366rb is doing anything unusual,
> compared to the other DSA drivers. If you are suggesting it needs to
> make use of the component driver, you might also be suggesting that
> all the switch drivers need to be component devices.

I'm saying any driver that does both (1) and (2) above needs to start
using the component devices. Otherwise, it is using a broken driver
model AND it'll be caught/enforced by fw_devlink=on. Even just doing
(1) is wrong, but fw_devlink=on doesn't catch/enforce it.

> I don't fully
> understand the details here, but it might be, you are also suggesting
> some Ethernet drivers need modifying to use the component framework?

I'm not sure if I'm saying that and more likely I'm NOT saying that
(see further below). Based on link [1] I think it's actually the
opposite of point (1) above. IIUC, in the case of FEC, you don't want
the child to probe before the parent is done (because the child
depends on the parent) and fw_devlink=on actually enforces that.

What I'm pointing out is that since dsa_register_switch() is assuming
the PHYs are ready/attaches to PHYs when it's called, if ALL the
following conditions are met, a DSA switch driver needs to use a
component driver model:
1. Switches that have child PHYs that dsa_register_switch() will
attach to when called.
2. The child PHYs depend on services provided by the parent (eg:
interrupt controller)

I don't know much about dsa_register_switch(), but if the DSA
framework could NOT attempt to attach to PHYs as soon as the switch is
registered, that might solve the problem too without needing to use
the component driver model as that'll break the cyclic dependency.

> And that is not going to fly.
>
> This has all worked until now, things might need a few iterations with
> deferral, but we get there in the end.

A few deferred probes is totally fine and fw_devlink doesn't have any
issues with that. In fact, fw_devlink would cut down on some
unnecessary deferred probes [1]. I think you are mixing two different
cases. This case is not the same as [1]. In the case of this realtek
switch driver, it's saying the deferred probe of its child PHYs is NOT
allowed. And I'm saying that's not a valid assumption and the
component device model seems like one good way to handle this
situation.

> Maybe we need to back out the
> phy-handle patch? It does appear to be causing regressions.

I don't mind if you want to do that (fixing the issue Marek reported
is easy). But that doesn't mean this realtek driver isn't wrong and
needs to be fixed. It's just that fw_devlink=on is catching this more
clearly.

[1] - https://lore.kernel.org/netdev/CAGETcx9=AyEfjX_-adgRuX=8a0MkLnj8sy2KJGhxpNCinJu4yA@mail.gmail.com/

-Saravana

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ