Message-ID: <20251210132140.32dbc3d7@bootlin.com>
Date: Wed, 10 Dec 2025 13:21:40 +0100
From: Herve Codina <herve.codina@...tlin.com>
To: Geert Uytterhoeven <geert@...ux-m68k.org>
Cc: Kalle Niemi <kaleposti@...il.com>, Rob Herring <robh@...nel.org>, Matti
Vaittinen <mazziesaccount@...il.com>, linux-arm-kernel@...ts.infradead.org,
Andrew Lunn <andrew@...n.ch>, Krzysztof Kozlowski <krzk+dt@...nel.org>,
Conor Dooley <conor+dt@...nel.org>, Greg Kroah-Hartman
<gregkh@...uxfoundation.org>, "Rafael J. Wysocki" <rafael@...nel.org>,
Danilo Krummrich <dakr@...nel.org>, Shawn Guo <shawnguo@...nel.org>, Sascha
Hauer <s.hauer@...gutronix.de>, Pengutronix Kernel Team
<kernel@...gutronix.de>, Fabio Estevam <festevam@...il.com>, Michael
Turquette <mturquette@...libre.com>, Stephen Boyd <sboyd@...nel.org>, Andi
Shyti <andi.shyti@...nel.org>, Wolfram Sang
<wsa+renesas@...g-engineering.com>, Peter Rosin <peda@...ntia.se>, Arnd
Bergmann <arnd@...db.de>, Bjorn Helgaas <bhelgaas@...gle.com>, Charles
Keepax <ckeepax@...nsource.cirrus.com>, Richard Fitzgerald
<rf@...nsource.cirrus.com>, David Rhodes <david.rhodes@...rus.com>, Linus
Walleij <linus.walleij@...aro.org>, Ulf Hansson <ulf.hansson@...aro.org>,
Mark Brown <broonie@...nel.org>, Andy Shevchenko
<andriy.shevchenko@...ux.intel.com>, Daniel Scally <djrscally@...il.com>,
Heikki Krogerus <heikki.krogerus@...ux.intel.com>, Sakari Ailus
<sakari.ailus@...ux.intel.com>, Len Brown <lenb@...nel.org>, Davidlohr
Bueso <dave@...olabs.net>, Jonathan Cameron <jonathan.cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>, Alison Schofield
<alison.schofield@...el.com>, Vishal Verma <vishal.l.verma@...el.com>, Ira
Weiny <ira.weiny@...el.com>, Dan Williams <dan.j.williams@...el.com>,
Wolfram Sang <wsa@...nel.org>, devicetree@...r.kernel.org,
linux-kernel@...r.kernel.org, imx@...ts.linux.dev,
linux-clk@...r.kernel.org, linux-i2c@...r.kernel.org,
linux-pci@...r.kernel.org, linux-sound@...r.kernel.org,
patches@...nsource.cirrus.com, linux-gpio@...r.kernel.org,
linux-pm@...r.kernel.org, linux-spi@...r.kernel.org,
linux-acpi@...r.kernel.org, linux-cxl@...r.kernel.org, Allan Nielsen
<allan.nielsen@...rochip.com>, Horatiu Vultur
<horatiu.vultur@...rochip.com>, Steen Hegelund
<steen.hegelund@...rochip.com>, Luca Ceresoli <luca.ceresoli@...tlin.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>
Subject: Re: [PATCH v4 01/29] Revert "treewide: Fix probing of devices in DT
overlays"
Hi Geert, Kalle, Rob,
On Thu, 4 Dec 2025 11:49:13 +0100
Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
> Hi Hervé,
>
> On Thu, 4 Dec 2025 at 08:39, Herve Codina <herve.codina@...tlin.com> wrote:
> > Indeed, Kalle, Geert, I don't have your hardware, your related overlay or
> > a similar one that could be used for test and also I don't have your out of
> > tree code used to handle this overlay.
> >
> > I know overlays and fw_devlink have issues. Links created by fw_devlink
> > when an overlay is applied were not correct on my side.
> >
> > Can you check your <supplier>--<consumer> links with 'ls /sys/class/devlinks'
> >
> > On my side, without my patches some links were not correct.
> > They linked to the parent of the supplier instead of the supplier itself.
> > The consequence is a kernel crash, use after free, refcounting failure, ...
> > when the supplier device is removed.
> >
> > Indeed, with wrong links consumers were not removed before suppliers they
> > used.
> >
> > Looking at Geert traces:
> > --- 8< ---
> > rcar_sound ec500000.sound: Failed to create device link (0x180) with
> > supplier soc for /soc/sound@...00000/rcar_sound,src/src-0
> > rcar_sound ec500000.sound: Failed to create device link (0x180) with
> > supplier soc for /soc/sound@...00000/rcar_sound,src/src-1
> > [...]
> > --- 8< ---
> >
> > Even if it is not correct, why the soc device cannot be a provider?
> > I don't have the answer to this question yet.
>
> I have no idea. These failures (sound) are also not related to the
> device I am adding through the overlay (SPI EEPROM).
> Note that these failures appear only with your suggested fix, and are
> not seen with just the patch in the subject of this email thread.
>
> > Without having the exact tree structure of the base device-tree, the overlay
> > and the way it is applied, and so without been able to reproduce the issue
> > on my side, investigating the issue is going to be difficult.
> >
> > I hope to find some help to move forward and fix the issue.
>
> Base DTS is [1], overlay DTS is [2].
> Applying and removing the overlay is done using OF_CONFIGFS[3],
> and "overlay [add|rm] 25lc040"[4].
>
> I assume you can reproduce the issue on any board that has an SPI
> EEPROM, after moving the SPI bus enablement and SPI EEPROM node to an
> overlay. Probably even with an I2C EEPROM instead. Or even without
> an actual EEPROM connected, as even the SPI bus fails to appear.
>
> > Saravana's email (Saravana Kannan <saravanak@...gle.com>) seems incorrect.
> > Got emails delivery failure with this email address.
>
> Yeah, he moved company.
> He is still alive, I met him in the LPC Training Session yesterday ;-)
>
> Thanks!
>
> [1] https://web.git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/tree/arch/arm64/boot/dts/renesas/r8a77990-ebisu.dts
> [2] https://web.git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/tree/arch/arm64/boot/dts/renesas/r8a77990-ebisu-cn41-msiof0-25lc040.dtso?h=topic/renesas-overlays-v6.17-rc1
> [3] https://web.git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/log/?h=topic/overlays-v6.17-rc1
> [4] https://elinux.org/R-Car/DT-Overlays#Helper_Script
> [5] https://lore.kernel.org/CAMuHMdXEnSD4rRJ-o90x4OprUacN_rJgyo8x6=9F9rZ+-KzjOg@mail.gmail.com/
>
I ran some tests on the boards I have.

First, I used a Marvell board based on an Armada 3720.
In my overlay, I added the pinmux related to the SPI controller, enabled
this SPI controller and added an SPI flash.

It didn't work, with or without the culprit patches from my series applied.

Indeed, the pinctrl driver used is an MFD driver, and its node mixes pinmux
definition nodes with a device description (a clock) node.

When a new node is added, a new device is created. Indeed, because the
driver is an MFD driver, it is a bus driver and is handled by the of_platform
bus. devlink considers my new node as a node that will have a device ready
to work (driver attached and device probed). A link is created between this
node and the consumers of this node (i.e. the SPI controller). devlink
waits for this provider to be ready before allowing its consumer to probe.
This node (a simple pinmux description) will never lead to a device, so
devlink will never see this "provider" as ready.
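
As a rough illustration of the reasoning above, here is a toy model of the
supplier/consumer ordering. It is not kernel code and does not reproduce the
fw_devlink implementation; the node names and the probe_order() helper are
hypothetical and only mirror the situation described: a consumer may probe
only once all of its suppliers have probed, and a supplier node that never
gets a driver blocks its consumers forever.

```python
# Toy model of supplier/consumer probe ordering (an assumption-laden sketch,
# not the kernel's fw_devlink logic).

def probe_order(suppliers_of, has_driver):
    """Return the nodes that manage to probe, in the order they probe.

    suppliers_of: dict mapping a node name to the list of its suppliers.
    has_driver:   dict mapping a node name to True if a driver can bind.
    """
    probed = []
    progress = True
    while progress:
        progress = False
        for node in suppliers_of:
            if node in probed or not has_driver[node]:
                continue
            # A consumer is allowed to probe only when every supplier is ready.
            if all(s in probed for s in suppliers_of[node]):
                probed.append(node)
                progress = True
    return probed


# Hypothetical overlay case: "spi_pins" is a plain pinmux description.
# It is treated as a supplier, but no driver will ever bind to it.
overlay_case = {
    "pinctrl": [],
    "spi_pins": [],
    "spi": ["spi_pins"],
    "flash": ["spi"],
}
overlay_drivers = {"pinctrl": True, "spi_pins": False, "spi": True, "flash": True}

# Hypothetical case where the supplier resolves to a device with a driver.
resolved_case = {
    "pinctrl": [],
    "spi": ["pinctrl"],
    "flash": ["spi"],
}
resolved_drivers = {"pinctrl": True, "spi": True, "flash": True}

stuck = probe_order(overlay_case, overlay_drivers)
working = probe_order(resolved_case, resolved_drivers)
```

In the first case the SPI controller and the flash never probe, because they
wait on a "provider" that never becomes ready. In the second case everything
probes in dependency order.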

I also ran a test with a Renesas RZ/N1D (r9a06g032) based board and built a
similar overlay involving the I2C controller pinmux, the I2C controller and
an EEPROM.

Here too, the overlay didn't work, but the issue is different.

The pinmux definitions for pinctrl (i.e. pinctrl subnodes) are looked up when
the pinctrl driver probes. A node added later is not handled by the
pinctrl driver.

Applying the overlay simply leads to:
[ 16.934168] rzn1-pinctrl 40067000.pinctrl: unable to find group for node /soc/pinctrl@...67000/pins_i2c2

Indeed, the 'pins_i2c2' node has been added by the overlay and was not
present when the pinctrl driver probed.

I then tried without adding a new pinmux node (pinctrl subnode) from the
overlay, using nodes already existing in the base DT instead.

On my Marvell Armada 3720 board, it works with or without my patches.
No regression detected due to my patches.

On my RZ/N1D board, it also works with or without my patches.
Here too, no regression detected.

Also, on my Marvell Armada 3720 board, I can plug in my LAN966x PCI board.
The LAN966x PCI driver uses an overlay to describe the LAN966x PCI board.

With the upstream patch not reverted, i.e. with 1a50d9403fb9 ("treewide: Fix
probing of devices in DT overlays") applied, the devlinks created for the
LAN966x PCI board internal devices are incorrect and lead to crashes when
the LAN966x PCI driver is removed, due to wrong provider/consumer
dependencies.

When this patch is reverted and replaced by "of: dynamic: Fix overlayed
devices not probing because of fw_devlink", the devlinks created for the
LAN966x PCI board internal devices are correct and the crashes no longer
occur on removal.
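
To compare links before and after applying an overlay, a small helper along
these lines may be handy. It is only a sketch: it assumes the links are
exposed as directories named <supplier>--<consumer> under /sys/class/devlink
(this path and the naming are assumptions to verify on your kernel), and the
function names and the example entry are hypothetical.

```python
import os


def parse_devlink_name(name):
    """Split a '<supplier>--<consumer>' entry name into its two halves.

    Splits on the first '--', which is assumed to separate the two names.
    """
    supplier, sep, consumer = name.partition("--")
    if not sep:
        raise ValueError("not a <supplier>--<consumer> name: %r" % name)
    return supplier, consumer


def list_devlinks(path="/sys/class/devlink"):
    """Return sorted (supplier, consumer) pairs found under 'path'.

    Returns an empty list when the directory does not exist, so the
    helper can also be run on a machine without this sysfs class.
    """
    if not os.path.isdir(path):
        return []
    names = [n for n in os.listdir(path) if "--" in n]
    return sorted(parse_devlink_name(n) for n in names)


if __name__ == "__main__":
    for supplier, consumer in list_devlinks():
        print("%s -> %s" % (supplier, consumer))
```

Running it before and after applying the overlay and diffing the two outputs
should show which supplier each new consumer has been linked to, for instance
whether a consumer is linked to the parent of its supplier instead of the
supplier itself.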

Kalle, Geert, can you run a test on your hardware with my patches
applied, after moving your pinmux definition from the overlay to the base
device-tree?

You can use, for instance, the kernel at the next-20251127 tag.
The patches needed for the test are present in this kernel:
- 76841259ac092 ("of: dynamic: Fix overlayed devices not probing because of fw_devlink")
- 7d67ddc5f0148 ("Revert "treewide: Fix probing of devices in DT overlays"")
Best regards,
Hervé