Message-ID: <20250212163320.24d30adb@booty>
Date: Wed, 12 Feb 2025 16:33:20 +0100
From: Luca Ceresoli <luca.ceresoli@...tlin.com>
To: Saravana Kannan <saravanak@...gle.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "Rafael J. Wysocki"
<rafael@...nel.org>, Francesco <francesco.dolcini@...adex.com>, Geert
Uytterhoeven <geert@...ux-m68k.org>, Tomi Valkeinen
<tomi.valkeinen@...asonboard.com>, kernel-team@...roid.com,
linux-kernel@...r.kernel.org, Rob Herring <robh@...nel.org>, Krzysztof
Kozlowski <krzysztof.kozlowski@...aro.org>, Conor Dooley
<conor@...nel.org>, Hervé Codina <herve.codina@...tlin.com>
Subject: Re: [PATCH v3] driver core: fw_devlink: Stop trying to optimize
cycle detection logic
Hello,
On Fri, 6 Dec 2024 10:31:43 +0100
Luca Ceresoli <luca.ceresoli@...tlin.com> wrote:
> > After rebasing my work for the hotplug connector driver using device
> > tree overlays [0] on v6.13-rc1 I started getting these OF errors on
> > overlay removal:
> >
> > OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/panel-dsi-lvds
> > OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/backlight-addon
> > OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/battery-charger
> > OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/regulator-addon-5v0-sys
> > OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/regulator-addon-3v3-sys
> >
> > ...and many more. Exactly one for each device in the overlay 'devices'
> > node, each implemented by a platform driver.
> >
> > Bisecting found this patch is triggering these error messages, which
> > in fact disappear by reverting it.
> >
> > I looked at the differences in dmesg and /sys/class/devlink/ in the
> > "good" and "bad" cases, and found almost no differences. The only
> > relevant difference is in cycle detection for the panel node, which was
> > expected, but nothing about all the other nodes like regulators.
> >
> > Enabling debug messages in core.c also does not show significant
> > changes between the two cases, even though it's hard to be sure given
> > the verbosity of the log and the reordering of messages.
> >
> > I suspect the new version of the cycle removal code is missing an
> > of_node_get() somewhere, but that is not directly visible in the patch
> > diff itself.
>
> I collected some more info by adding a bit of logging for one of the
> affected devices.
>
> It looks like the of_node_get() and of_node_put() in the overlay
> loading phase are the same, even though not completely in the same
> order. So after overlay insertion we should have the same refcount with
> and without your patch.
>
> There is a difference on overlay removal however: an of_node_put() call
> is absent with 6.13-rc1 code (errors emitted), and becomes present by
> just reverting your patch (the "good" case). Here's the stack trace of
> this call:
>
> Call trace:
> show_stack+0x20/0x38 (C)
> dump_stack_lvl+0x74/0x90
> dump_stack+0x18/0x28
> of_node_put+0x50/0x70
> platform_device_release+0x24/0x68
> device_release+0x3c/0xa0
> kobject_put+0xa4/0x118
> device_link_release_fn+0x60/0xd8
> process_one_work+0x158/0x3c0
> worker_thread+0x2d8/0x3e8
> kthread+0x118/0x128
> ret_from_fork+0x10/0x20
>
> So for some reason device_link_release_fn() is not leading to an
> of_node_put() call after adding your patch.
>
> Quick code inspection did not show any useful info for me to understand
> more.
I just sent a patch fixing this:
https://lore.kernel.org/20250212-fix__fw_devlink_relax_cycles_missing_device_put-v1-1-41818c7d7722@bootlin.com
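For readers following the thread, here is a rough user-space sketch of the failure mode described above. It is not kernel code and not the fix itself. It assumes, based only on the patch title in the link and the stack trace above, that a device reference taken during cycle handling was never dropped, so the device release path (which is where of_node_put() was seen being called) never runs. All struct and function names below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a refcounted device tree node. */
struct node {
	int refcount;
};

/* Hypothetical stand-in for a device that pins its node while it exists. */
struct dev {
	int refcount;
	struct node *of_node;
};

void node_get(struct node *n)
{
	n->refcount++;
}

void node_put(struct node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

void dev_get(struct dev *d)
{
	d->refcount++;
}

/* Dropping the last device reference releases the device's node reference. */
void dev_put(struct dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		node_put(d->of_node);
}

/*
 * Model one overlay device from creation to removal and return the node
 * refcount seen at removal time. drop_temp_ref selects whether the
 * temporary device reference is dropped (balanced) or leaked.
 */
int node_refcount_at_removal(int drop_temp_ref)
{
	struct node n = { .refcount = 1 };	/* held by the overlay changeset */
	struct dev d = { .refcount = 1, .of_node = &n };

	node_get(&n);		/* device creation pins the node: 2 */

	dev_get(&d);		/* temporary reference taken while checking */
	if (drop_temp_ref)
		dev_put(&d);	/* balanced case only */

	dev_put(&d);		/* device removal drops the initial reference */

	if (n.refcount != 1)
		printf("expected refcount 1 instead of %d\n", n.refcount);

	return n.refcount;
}
```

Calling node_refcount_at_removal(1) models the balanced case and returns 1. Calling node_refcount_at_removal(0) models the leaked device reference: the device never reaches refcount 0, its release step is skipped, and the node is left at 2, which matches the shape of the OF errors quoted above.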
Luca
--
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com