Message-ID: <20241204124826.2e055091@booty>
Date: Wed, 4 Dec 2024 12:48:26 +0100
From: Luca Ceresoli <luca.ceresoli@...tlin.com>
To: Saravana Kannan <saravanak@...gle.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "Rafael J. Wysocki"
 <rafael@...nel.org>, Francesco <francesco.dolcini@...adex.com>, Geert
 Uytterhoeven <geert@...ux-m68k.org>, Tomi Valkeinen
 <tomi.valkeinen@...asonboard.com>, kernel-team@...roid.com,
 linux-kernel@...r.kernel.org, Rob Herring <robh@...nel.org>, Krzysztof
 Kozlowski <krzysztof.kozlowski@...aro.org>, Conor Dooley
 <conor@...nel.org>, Hervé Codina
 <herve.codina@...tlin.com>
Subject: Re: [PATCH v3] driver core: fw_devlink: Stop trying to optimize
 cycle detection logic

Hello Saravana,

+Cc: DT maintainers, Hervé

On Wed, 30 Oct 2024 10:10:07 -0700
Saravana Kannan <saravanak@...gle.com> wrote:

> In attempting to optimize fw_devlink runtime, I introduced numerous cycle
> detection bugs by foregoing cycle detection logic under specific
> conditions. Each fix has further narrowed the conditions for optimization.
> 
> It's time to give up on these optimization attempts and just run the cycle
> detection logic every time fw_devlink tries to create a device link.
> 
> The specific bug report that triggered this fix involved a supplier fwnode
> that never gets a device created for it. Instead, the supplier fwnode is
> represented by the device that corresponds to an ancestor fwnode.
> 
> In this case, fw_devlink didn't do any cycle detection because the cycle
> detection logic is only run when a device link is created between the
> devices that correspond to the actual consumer and supplier fwnodes.
> 
> With this change, fw_devlink will run cycle detection logic even when
> creating SYNC_STATE_ONLY proxy device links from a device that is an
> ancestor of a consumer fwnode.
> 
> Reported-by: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
> Closes: https://lore.kernel.org/all/1a1ab663-d068-40fb-8c94-f0715403d276@ideasonboard.com/
> Fixes: 6442d79d880c ("driver core: fw_devlink: Improve detection of overlapping cycles")
> Tested-by: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
> Signed-off-by: Saravana Kannan <saravanak@...gle.com>

After rebasing my work for the hotplug connector driver using device
tree overlays [0] on v6.13-rc1 I started getting these OF errors on
overlay removal:

OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/panel-dsi-lvds
OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/backlight-addon
OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/battery-charger
OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/regulator-addon-5v0-sys
OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /addon-connector/devices/regulator-addon-3v3-sys

...and many more: exactly one for each device in the overlay 'devices'
node, each of which is implemented by a platform driver.

Bisecting shows that this patch triggers these error messages, and they
disappear when it is reverted.

I looked at the differences in dmesg and /sys/class/devlink/ in the
"good" and "bad" cases, and found almost no differences. The only
relevant difference is in cycle detection for the panel node, which was
expected, but nothing about all the other nodes like regulators.

Enabling debug messages in core.c also does not show significant
changes between the two cases, even though it's hard to be sure given
the verbosity of the log and the reordering of messages.

I suspect the new version of the cycle removal code is missing an
of_node_put() somewhere (the refcount is one higher than expected), but
that is not directly visible in the patch diff itself.
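
To illustrate the kind of imbalance the OF messages complain about, here
is a minimal user-space sketch. It is not kernel code and not taken from
the patch: struct fake_node, fake_node_get(), fake_node_put(),
walk_balanced(), walk_leaky() and check_on_destroy() are all hypothetical
names that only model a reference count. It assumes the leak comes from a
code path that takes a reference and returns without dropping it, which is
one plausible way to end up with "expected refcount 1 instead of 2".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a device tree node: only the refcount is modelled. */
struct fake_node {
	const char *path;
	int refcount;
};

/* Take a reference (models the role of of_node_get(), not its implementation). */
static struct fake_node *fake_node_get(struct fake_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Drop a reference (models the role of of_node_put()). */
static void fake_node_put(struct fake_node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);
	n->refcount--;
}

/* Balanced walk: the reference taken on entry is dropped before returning. */
static void walk_balanced(struct fake_node *n)
{
	struct fake_node *ref = fake_node_get(n);

	/* ... inspect the node ... */
	fake_node_put(ref);
}

/* Leaky walk: the early return skips the put, so one reference is never dropped. */
static void walk_leaky(struct fake_node *n, int bail_out_early)
{
	struct fake_node *ref = fake_node_get(n);

	if (bail_out_early)
		return;		/* missing fake_node_put(ref) on this path */

	fake_node_put(ref);
}

/*
 * Models a check done when the overlay is removed: the only remaining
 * reference should be the one held by the owner. Returns 0 when the
 * count is as expected, -1 otherwise.
 */
static int check_on_destroy(const struct fake_node *n)
{
	if (n->refcount != 1) {
		printf("ERROR: memory leak, expected refcount 1 instead of %d: %s\n",
		       n->refcount, n->path);
		return -1;
	}
	return 0;
}
```

Starting from a node with refcount 1, calling walk_balanced() leaves the
count at 1 and check_on_destroy() returns 0; a single call to
walk_leaky(node, 1) leaves it at 2 and the check fails, once per affected
node, which matches the "one message per device" pattern above.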

Any clues?

[0] https://lore.kernel.org/all/20240917-hotplug-drm-bridge-v4-0-bc4dfee61be6@bootlin.com/

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
