[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f3c741dc9d869212dec10049e8b0167ff350c5e0.camel@gmail.com>
Date: Tue, 27 Feb 2024 20:28:45 +0100
From: Nuno Sá <noname.nuno@...il.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Herve Codina <herve.codina@...tlin.com>, Saravana Kannan
<saravanak@...gle.com>, Luca Ceresoli <luca.ceresoli@...tlin.com>, Nuno Sa
<nuno.sa@...log.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Rob
Herring <robh+dt@...nel.org>, Frank Rowand <frowand.list@...il.com>, Lizhi
Hou <lizhi.hou@....com>, Max Zhen <max.zhen@....com>, Sonal Santan
<sonal.santan@....com>, Stefano Stabellini <stefano.stabellini@...inx.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
linux-kernel@...r.kernel.org, devicetree@...r.kernel.org, Allan Nielsen
<allan.nielsen@...rochip.com>, Horatiu Vultur
<horatiu.vultur@...rochip.com>, Steen Hegelund
<steen.hegelund@...rochip.com>, Thomas Petazzoni
<thomas.petazzoni@...tlin.com>, Android Kernel Team
<kernel-team@...roid.com>
Subject: Re: [PATCH 2/2] of: overlay: Synchronize of_overlay_remove() with
the devlink removals
On Tue, 2024-02-27 at 20:13 +0100, Rafael J. Wysocki wrote:
> On Tue, Feb 27, 2024 at 8:08 PM Nuno Sá <noname.nuno@...il.com> wrote:
> >
> > Hi Herve,
> >
> > On Tue, 2024-02-27 at 18:54 +0100, Herve Codina wrote:
> > > Hi Nuno,
> > >
> > > On Tue, 27 Feb 2024 17:55:07 +0100
> > > Nuno Sá <noname.nuno@...il.com> wrote:
> > >
> > > > On Tue, 2024-02-27 at 16:24 +0100, Herve Codina wrote:
> > > > > Hi Saravana, Luca, Nuno,
> > > > >
> > > > > On Tue, 20 Feb 2024 16:37:05 -0800
> > > > > Saravana Kannan <saravanak@...gle.com> wrote:
> > > > >
> > > > > ...
> > > > >
> > > > > > >
> > > > > > > diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> > > > > > > index a9a292d6d59b..5c5f808b163e 100644
> > > > > > > --- a/drivers/of/overlay.c
> > > > > > > +++ b/drivers/of/overlay.c
> > > > > > > @@ -1202,6 +1202,12 @@ int of_overlay_remove(int *ovcs_id)
> > > > > > > goto out;
> > > > > > > }
> > > > > > >
> > > > > > > + /*
> > > > > > > + * Wait for any ongoing device link removals before removing
> > > > > > > some
> > > > > > > of
> > > > > > > + * nodes
> > > > > > > + */
> > > > > > > + device_link_wait_removal();
> > > > > > > +
> > > > > >
> > > > > > Nuno in his patch[1] had this "wait" happen inside
> > > > > > __of_changeset_entry_destroy(). Which seems to be necessary to not hit
> > > > > > the issue that Luca reported[2] in this patch series. Is there any
> > > > > > problem with doing that?
> > > > > >
> > > > > > Luca for some reason did a unlock/lock(of_mutex) in his test patch and
> > > > > > I don't think that's necessary.
> > > > >
> > > > > I think the unlock/lock in Luca's case and so in Nuno's case is needed.
> > > > >
> > > > > I do the device_link_wait_removal() wihout having the of_mutex locked.
> > > > >
> > > > > Now, suppose I do the device_link_wait_removal() call with the of_mutex
> > > > > locked.
> > > > > The following flow is allowed and a deadlock is present.
> > > > >
> > > > > of_overlay_remove()
> > > > > lock(of_mutex)
> > > > > device_link_wait_removal()
> > > > >
> > > > > And, from the workqueue jobs execution:
> > > > > ...
> > > > > device_put()
> > > > > some_driver->remove()
> > > > > of_overlay_remove() <--- The job will never end.
> > > > > It is waiting for of_mutex.
> > > > > Deadlock
> > > > >
> > > >
> > > > We may need some input from Saravana (and others) on this. I might be missing
> > > > something but can a put_device() lead into a driver remove callback? Driver
> > > > code
> > > > is
> > > > not device code and put_device() leads to device_release() which will either
> > > > call
> > > > the
> > > > device ->release(), ->type->release() or the class ->dev_release(). And, IMO,
> > > > calling
> > > > of_overlay_remove() or something like that (like something that would lead to
> > > > unbinding a device from it's driver) in a device release callback would be at
> > > > the
> > > > very least very questionable. Typically, what you see in there is
> > > > of_node_put()
> > > > and
> > > > things like kfree() of the device itself or any other data.
> > >
> > > I think that calling of_overlay_remove() in a device release callback makes
> > > sense. The overlay is used to declare sub-nodes from the device node. It
> > > does not add/remove the device node itself but sub-nodes.
> > >
> >
> > I think we are speaking about two different things... device release is not the
> > same
> > as the driver remove callback. I admit the pci case seems to be a beast of it's
> > own
> > and I just spent some time (given your links) on it so I can't surely be sure
> > about
> > what I'm about to say... But, AFAICT, I did not saw any overlay or changeset
> > being
> > removed from a kobj_type release callback.
> >
> > > The use case is the use of DT overlays to describe PCI devices.
> > > https://lore.kernel.org/all/1692120000-46900-1-git-send-email-lizhi.hou@amd.com/
> > > https://lore.kernel.org/lkml/20220427094502.456111-1-clement.leger@bootlin.com/
> > > --- 8< ---
> > > The lan966x SoCs can be used in two different ways:
> > >
> > > - It can run Linux by itself, on ARM64 cores included in the SoC. This
> > > use-case of the lan966x is currently being upstreamed, using a
> > > traditional Device Tree representation of the lan996x HW blocks [1]
> > > A number of drivers for the different IPs of the SoC have already
> > > been merged in upstream Linux.
> > >
> > > - It can be used as a PCIe endpoint, connected to a separate platform
> > > that acts as the PCIe root complex. In this case, all the devices
> > > that are embedded on this SoC are exposed through PCIe BARs and the
> > > ARM64 cores of the SoC are not used. Since this is a PCIe card, it
> > > can be plugged on any platform, of any architecture supporting PCIe.
> > > --- 8< ---
> > >
> > > This quite long story led to DT overlay support for PCI devices and so the
> > > unittest I mentioned:
> > > https://elixir.bootlin.com/linux/v6.8-rc6/source/drivers/of/unittest.c#L3946
> > >
> > >
> > > So, I have a PCI driver that bind to the lan966x PCI board.
> > > This driver loads an overlay at probe() and unload it at remove().
> > > Also, this driver can be module. A simple rmmod leads to the remove() call.
> > >
> >
> > Hmm, and I think that would not be an issue... Note that the code that runs in
> > device_link_release_fn() is doing put_device() which ends ups on the kobj_type
> > release callback and so far I could not see any evidence of such a callback being
> > responsible of calling device_remove() on another device. That would be weird (I
> > think) since I would expect such a call to happen in a kind of unregister
> > function.
> >
> > > This driver is not yet upstream because I haven't yet fixed all the issues I
> > > encountered that's why of now, I can point only the unittest related to overlay
> > > support for PCI.
> > >
> > > >
> > > > The driver remove callback should be called when unbinding the device from
> > > > it's
> > > > drivers and devlinks should also be removed after device_unbind_cleanup()
> > > > (i.e,
> > > > after
> > > > the driver remove callback).
> > > >
> > > > Having said the above, the driver core has lots of subtleties so, again, I
> > > > can be
> > > > missing something. But at this point I'm still not seeing any deadlock...
> > > >
> > >
> > > I gave a wrong example.
> > > Based on Luca's sequence he gave in
> > > https://lore.kernel.org/all/20231220181627.341e8789@booty/
> >
> > Regarding Luca's comments, my first approach was actually to just make the
> > devlink
> > removal synchronously... I'm still not sure what would be the issue of doing that
> > (other than potentially waiting some time for the srcu synchronization).
>
> It would allow forward progress to be made, but it would add potential
> delay for everybody, which is only really needed in the DT overlay
> case. Sounds like something to avoid to me.
I mean, we could surely detect (or have a way to do it) if the devlink is being
removed as part of an overlay removal and only call device_link_release_fn()
synchronously in that case. It would minimize the effects I guess.
But yeah, we still need to make sure there's an actual deadlock... I'm still not
seeing it but Herve may very well be correct about it.
>
> > About the unlock, I'm just not sure what could happen if someone else (other than
> > us) sneaks in
> > and grabs the of_mutex while we are in the middle of removing an overlay...
>
> If that is possible at all, I'd call it a bug.
I think, in theory, it could happen as it looks to be a fairly coarse grained lock
for OF:
https://elixir.bootlin.com/linux/latest/source/drivers/of/of_private.h#L40
- Nuno Sá
Powered by blists - more mailing lists