lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 12 Dec 2023 00:47:57 -0300
From: Luiz Angelo Daros de Luca <luizluca@...il.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: Alvin Šipraga <ALSI@...g-olufsen.dk>, 
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
	"linus.walleij@...aro.org" <linus.walleij@...aro.org>, "andrew@...n.ch" <andrew@...n.ch>, 
	"f.fainelli@...il.com" <f.fainelli@...il.com>, "davem@...emloft.net" <davem@...emloft.net>, 
	"edumazet@...gle.com" <edumazet@...gle.com>, "kuba@...nel.org" <kuba@...nel.org>, 
	"pabeni@...hat.com" <pabeni@...hat.com>, "arinc.unal@...nc9.com" <arinc.unal@...nc9.com>
Subject: Re: [PATCH net-next 2/7] net: dsa: realtek: put of node after MDIO registration

> > Reviewing the code again, I believe it was not just misplacing the
> > of_put_node() but probably calling it twice.
> >
> > devm_mdiobus_alloc() doesn't set the dev in mii_bus. So, dev is all
> > zeros. The dev.of_node normal place to be defined is:
> >
> > devm_of_mdiobus_register()
> >   __devm_of_mdiobus_register()
> >     __of_mdiobus_register()
> >       device_set_node()
> >
> > The only way for that value, set by the line I removed, to persist is
> > when the devm_of_mdiobus_register() fails before device_set_node().
>
> Did you consider that __of_mdiobus_register() -> device_set_node() is
> actually overwriting priv->user_mii_bus->dev.of_node with the same value?
> So the reference to mdio_np persists even if technically overwritten.
> The fact that the assignment looks redundant is another story.

Yes, I believe it is just redundant.

> > My guess is that it was set to be used by realtek_smi_remove() if it
> > is called when registration fails. However, in that case, both
> > realtek_smi_setup_mdio() and realtek_smi_setup_mdio() would put the
>
> You listed the same function name twice. You meant realtek_smi_remove()
> the second time?

yes

> > node. So, either the line is useless or it will effectively result in
> > calling of_node_put() twice.
>
> False logic, since realtek_smi_remove() is not called when probe() fails.
> ds->ops->setup() is called from probe() context. So no double of_node_put().
> That's a general rule with the kernel API. When a setup API function fails,
> it is responsible of cleaning up the temporary things it did. The
> teardown API function is only called when the setup was performed fully.

So, "the line is useless".

> (the only exception I'm aware of is the Qdisc API, but that's not
> exactly the best model to follow)
>
> > If I really needed to put that node in the realtek_smi_remove(), I
> > would use a dedicated field in realtek_priv instead of reusing a
> > reference for it inside another structure.
> >
> > I'll add some notes to the commit message about all these but moving
> > the of_node_put() to the same function that gets the node solved all
> > the issues.
>
> "Solved all the issues" - what are those issues, first of all?

1) the useless assignment
2) the (possible) double of_node_put(), which proved to be false.

> The simple fact is: of_get_compatible_child() returns an OF node with an
> elevated refcount. It passes it to of_mdiobus_register() which does not
> take ownership of it per se, but assigns it to bus->dev.of_node, which
> is accessible until device_del() from mdiobus_unregister().

Normally, when you have a refcount system, whenever you have a
reference to an object, you should increase the refcount. I thought
that every time you assign a kobject to a structure, you should get it
as well (and put it when you deallocate it). But that's just what I
would expect, not something I found in docs.

I see distinct behaviors with methods that assign the dev.of_node
using device_set_node() in OF MDIO API code and that's not good:

static int of_mdiobus_register_device(struct mii_bus *mdio,
                                     struct device_node *child, u32 addr)
{
       (...)
       fwnode_handle_get(fwnode);
       device_set_node(&mdiodev->dev, fwnode);
       (...)
}

int fwnode_mdiobus_phy_device_register(struct mii_bus *mdio,
                                      struct phy_device *phy,
                                      struct fwnode_handle *child, u32 addr)
{
       (...)
       fwnode_handle_get(child);
       device_set_node(&phy->mdio.dev, child);
       (...)
}

int of_mdiobus_phy_device_register(struct mii_bus *mdio, struct phy_device *phy,
                                  struct device_node *child, u32 addr)
{
       return fwnode_mdiobus_phy_device_register(mdio, phy,
                                                 of_fwnode_handle(child),
}

int __of_mdiobus_register(struct mii_bus *mdio, struct device_node *np,
                         struct module *owner)
{
       (...)
       device_set_node(&mdio->dev, of_fwnode_handle(np));
       (...)
       for_each_available_child_of_node(np, child) {
                (...)
                if (of_mdiobus_child_is_phy(child))
                       rc = of_mdiobus_register_phy(mdio, child, addr);
               else
                       rc = of_mdiobus_register_device(mdio, child, addr);
       }
})

Each deals differently with the device_node it receives. Both
of_mdiobus_register_phy and of_mdiobus_register_device gets the child
before assigning it to the device but not __of_mdiobus_register. Why?

After some more study, I think it is just because, while an
of_node_get() just before device_set_node() fits nicely in
__of_mdiobus_register(), there is not a good place in of_mdio to put
it. We don't have an of_mdiobus_unregister(). The unregistration
happens only in mdiobus_unregister(), where, I guess, it should avoid
OF-specific code. Even if we put OF code there, we would need to know
during mdiobus_unregister() if the bus->dev.of_node was gotten by
of_mdio or someone else. I believe it is not nice to externally assign
dev.of_node directly to mdiobus but realtek switch driver is doing
just that and others might be doing the same thing.

The delegation of of_node_get/put to the caller seems to be just an
easy workaround the fact that there is no good place to put a node
that of_mdio would get. For devm functions, we could include the
get/put call creating a new devm_of_mdiobus_unregister() but I believe
devm and non-devm needs to be equivalent (except for the resource
deallocation).

> The PHY library does not make the ownership rules of the of_node very
> clear, but since it takes no reference on it, it will fail in subtle
> ways if you pull the carpet from under its feet.
>
> For example, I expect of_mdio_find_bus() to fail. That is used only
> rarely, like by the MDIO mux driver which I suppose you haven't tested.

No, I didn't test it. In fact, most embedded devices will not use
dynamic OF and all those of_node_get/put will be a noop.

> If you want, you could make the OF MDIO API get() and put() the reference,
> instead of using something it doesn't fully own. But currently the code
> doesn't do that. Try to acknowledge what exists, first.

What I saw in other drivers outside drivers/net is that one that
allocates the dev will get the node before assigning dev.of_node and
put it before freeing the device. The mdiobus case seems to be
different. I believe it would make the code more robust if we could
fix that inside OF MDIO API and not just document its behavior. It
will also not break existing uses as extra get/put's are OK.

I believe we could add an unregister callback to mii_bus. It wouldn't
be too complex:

>From b5b059ea4491e9f745872220fb94d8105e2d7d43 Mon Sep 17 00:00:00 2001
From: Luiz Angelo Daros de Luca <luizluca@...il.com>
Date: Tue, 12 Dec 2023 00:26:06 -0300
Subject: [PATCH] net: mdio: get/put device node during (un)registration

__of_mdiobus_register() was storing the device node in dev.of_node
without increasing its refcount. It was implicitly delegating to the
caller to maintain the node allocated until mdiobus was unregistered.

Now, the __of_mdiobus_register() will get the node before assigning it,
and of_mdiobus_unregister_callback() will be called at the end of
mdio_unregister().

Drivers can now put the node just after the MDIO registration.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@...il.com>
---
drivers/net/mdio/of_mdio.c | 12 +++++++++++-
drivers/net/phy/mdio_bus.c |  3 +++
include/linux/phy.h        |  3 +++
3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mdio/of_mdio.c b/drivers/net/mdio/of_mdio.c
index 64ebcb6d235c..9b6cab6154e0 100644
--- a/drivers/net/mdio/of_mdio.c
+++ b/drivers/net/mdio/of_mdio.c
@@ -139,6 +139,11 @@ bool of_mdiobus_child_is_phy(struct device_node *child)
}
EXPORT_SYMBOL(of_mdiobus_child_is_phy);

+static void __of_mdiobus_unregister_callback(struct mii_bus *mdio)
+{
+       of_node_put(mdio->dev.of_node);
+}
+
/**
 * __of_mdiobus_register - Register mii_bus and create PHYs from the device tree
 * @mdio: pointer to mii_bus structure
@@ -166,6 +171,8 @@ int __of_mdiobus_register(struct mii_bus *mdio,
struct device_node *np,
        * the device tree are populated after the bus has been registered */
       mdio->phy_mask = ~0;

+       mdio->__unregister_callback = __of_mdiobus_unregister_callback;
+       of_node_get(np);
       device_set_node(&mdio->dev, of_fwnode_handle(np));

       /* Get bus level PHY reset GPIO details */
@@ -177,7 +184,7 @@ int __of_mdiobus_register(struct mii_bus *mdio,
struct device_node *np,
       /* Register the MDIO bus */
       rc = __mdiobus_register(mdio, owner);
       if (rc)
-               return rc;
+               goto put_node;

       /* Loop over the child nodes and register a phy_device for each phy */
       for_each_available_child_of_node(np, child) {
@@ -237,6 +244,9 @@ int __of_mdiobus_register(struct mii_bus *mdio,
struct device_node *np,
unregister:
       of_node_put(child);
       mdiobus_unregister(mdio);
+
+put_node:
+       of_node_put(np);
       return rc;
}
EXPORT_SYMBOL(__of_mdiobus_register);
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 25dcaa49ab8b..1229b8e4c53b 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -787,6 +787,9 @@ void mdiobus_unregister(struct mii_bus *bus)
               gpiod_set_value_cansleep(bus->reset_gpiod, 1);

       device_del(&bus->dev);
+
+       if (bus->__unregister_callback)
+               bus->__unregister_callback(bus);
}
EXPORT_SYMBOL(mdiobus_unregister);

diff --git a/include/linux/phy.h b/include/linux/phy.h
index e5f1f41e399c..2b383da4d825 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -433,6 +433,9 @@ struct mii_bus {

       /** @shared: shared state across different PHYs */
       struct phy_package_shared *shared[PHY_MAX_ADDR];
+
+       /** @__unregister_callback: called at the last step of unregistration */
+       void (*__unregister_callback)(struct mii_bus *bus);
};
#define to_mii_bus(d) container_of(d, struct mii_bus, dev)

--
2.43.0

If we don't fix that in OF MDIO API, we would need to fix
fe7324b932222 as well, moving the put to the dsa_switch_teardown().

Regards,

Luiz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ