[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250207085730.GD24886@pendragon.ideasonboard.com>
Date: Fri, 7 Feb 2025 10:57:30 +0200
From: Laurent Pinchart <laurent.pinchart@...asonboard.com>
To: Oleksij Rempel <o.rempel@...gutronix.de>
Cc: Zhang Zekun <zhangzekun11@...wei.com>, robh@...nel.org,
saravanak@...gle.com, justin.chen@...adcom.com,
florian.fainelli@...adcom.com, andrew+netdev@...n.ch,
kuba@...nel.org, kory.maincent@...tlin.com,
jacopo+renesas@...ndi.org, kieran.bingham+renesas@...asonboard.com,
maddy@...ux.ibm.com, mpe@...erman.id.au, npiggin@...il.com,
olteanv@...il.com, davem@...emloft.net, taras.chornyi@...ision.eu,
edumazet@...gle.com, pabeni@...hat.com, sudeep.holla@....com,
cristian.marussi@....com, arm-scmi@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, linux-media@...r.kernel.org,
netdev@...r.kernel.org, devicetree@...r.kernel.org,
chenjun102@...wei.com
Subject: Re: [PATCH 1/9] of: Add warpper function
of_find_node_by_name_balanced()
Hi Oleksij,
On Fri, Feb 07, 2025 at 09:24:10AM +0100, Oleksij Rempel wrote:
> On Fri, Feb 07, 2025 at 09:31:09AM +0800, Zhang Zekun wrote:
> > There are many drivers use of_find_node_by_name() with a not-NULL
> > device_node pointer, and a number of callers would require a call to
> > of_node_get() before using it. There are also some drivers who forget
> > to call of_node_get() which would cause a ref count leak[1]. So, Add a
> > wraper function for of_find_node_by_name(), drivers may use this function
> > to call of_find_node_by_name() with the refcount already balanced.
> >
> > [1] https://lore.kernel.org/all/20241024015909.58654-1-zhangzekun11@huawei.com/
>
> Hi Zhang Zekun,
>
> thank you for working on this issue!
>
> First of all, let's take a step back and analyze the initial problem.
> Everything following is only my opinion...
>
> The main issue I see is that the current API - of_find_node_by_name -
> modifies the refcount of its input by calling of_node_put(from) as part
> of its search. Typically, a "find" function is expected to treat its
> input as read-only. That is, when you pass an object into such a
> function, you expect its reference count to remain unchanged unless
> ownership is explicitly transferred. In this case, lowering the refcount
> on the input node is counterintuitive and already lead to unexpected
> behavior and subtle bugs.
>
> To address this, the workaround introduces a wrapper function,
> of_find_node_by_name_balanced, which first increments the input’s
> refcount (via of_node_get()) before calling the original function. While
> this "balances" the refcount change, the naming remains problematic from
> my perspective. The "_balanced" suffix isn’t part of our common naming
> conventions (traditions? :)). Most drivers expect that a function
> starting with "find" will not alter the reference count of its input.
> The term "balanced" doesn’t clearly convey that the input's refcount is
> being explicitly managed - it instead obscures the underlying behavior,
> leaving many developers confused about what guarantees the API provides.
>
> In my view, a more natural solution would be to redesign the API so that
> it doesn’t modify the input object’s refcount at all. Instead, it should
> solely increase the refcount of the returned node (if found) for safe
> asynchronous usage. This approach would align with established
> conventions where "find" implies no side effects on inputs or output,
> and a "get" indicates that the output comes with an extra reference. For
> example, a function named of_get_node_by_name would clearly signal that
> only the returned node is subject to a refcount increase while leaving
> the input intact.
>
> Thus, while the current workaround "balances" the reference count, it
> doesn't address the underlying design flaw. The naming still suggests a
> "find" function that should leave the input untouched, which isn’t the
> case here. A redesign of the API - with both the behavior and naming
> aligned to common expectations - would be a clearer and more robust
> solution.
>
> Nevertheless, it is only my POV, and the final decision rests with the
> OpenFirmware framework maintainers.
I agree overall that the naming is not optimal. Looking at the other
patches in the series, I think at least some of them misuse
of_find_node_by_name(). For instance, drivers/media/i2c/max9286.c calls
the function to find a *child* node of the device's of_node named
"i2c-mux", while of_find_node_by_name() looks at children first but will
then walk the *whole* DT to find a named node. I haven't checked all
patches, but other ones seem to suffer from the same misuse.
Assuming that the named node those drivers are looking for is a direct
child of the node passed as argument to of_find_node_by_name(), the
right fix would tbe to use of_get_child_by_name(). If it's not a direct
child, a recursive version of of_get_child_by_name() could be useful.
--
Regards,
Laurent Pinchart
Powered by blists - more mailing lists