Message-ID: <CAGETcx_j4613QjHgX5AJ96Ux6MJSxxhT7DL36yzNv1JCsoxTAA@mail.gmail.com>
Date: Thu, 22 Feb 2024 15:51:15 -0800
From: Saravana Kannan <saravanak@...gle.com>
To: Bartosz Golaszewski <brgl@...ev.pl>
Cc: Herve Codina <herve.codina@...tlin.com>, Kent Gibson <warthog618@...il.com>,
Linus Walleij <linus.walleij@...aro.org>, linux-gpio@...r.kernel.org,
linux-kernel@...r.kernel.org, Luca Ceresoli <luca.ceresoli@...tlin.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>
Subject: Re: [PATCH 2/2] gpiolib: cdev: release IRQs when the gpio chip device
is removed
On Thu, Feb 22, 2024 at 4:21 AM Bartosz Golaszewski <brgl@...ev.pl> wrote:
>
> On Thu, Feb 22, 2024 at 12:36 PM Herve Codina <herve.codina@...tlin.com> wrote:
> >
> > Hi Bartosz,
> >
> > On Thu, 22 Feb 2024 00:31:08 -0800
> > Bartosz Golaszewski <brgl@...ev.pl> wrote:
> >
> > > On Thu, 22 Feb 2024 02:05:30 +0100, Kent Gibson <warthog618@...il.com> said:
> > > > On Thu, Feb 22, 2024 at 08:57:44AM +0800, Kent Gibson wrote:
> > > >> On Tue, Feb 20, 2024 at 10:29:59PM +0800, Kent Gibson wrote:
> > > >> > On Tue, Feb 20, 2024 at 12:10:18PM +0100, Herve Codina wrote:
> > > >>
> > > >> ...
> > > >>
> > > >> > > }
> > > >> > >
> > > >> > > +static int linereq_unregistered_notify(struct notifier_block *nb,
> > > >> > > + unsigned long action, void *data)
> > > >> > > +{
> > > >> > > + struct linereq *lr = container_of(nb, struct linereq,
> > > >> > > + device_unregistered_nb);
> > > >> > > + int i;
> > > >> > > +
> > > >> > > + for (i = 0; i < lr->num_lines; i++) {
> > > >> > > + if (lr->lines[i].desc)
> > > >> > > + edge_detector_stop(&lr->lines[i]);
> > > >> > > + }
> > > >> > > +
> > > >> >
> > > >> > Firstly, the re-ordering in the previous patch creates a race,
> > > >> > as the NULLing of the gdev->chip serves to numb the cdev ioctls, so
> > > >> > there is now a window between the notifier being called and that numbing,
> > > >> > during which userspace may call linereq_set_config() and re-request
> > > >> > the irq.
> > > >> >
> > > >> > There is also a race here with linereq_set_config(). That can be prevented
> > > >> > by holding the lr->config_mutex - assuming the notifier is not being called
> > > >> > from atomic context.
> > > >> >
> > > >>
> > > >> It occurs to me that the fixed reordering in patch 1 would place
> > > >> the notifier call AFTER the NULLing of the ioctls, so there will no longer
> > > >> be any chance of a race with linereq_set_config() - so holding the
> > > >> config_mutex semaphore is not necessary.
> > > >>
> > > >
> > > > NULLing -> numbing
> > > >
> > > > The gdev->chip is NULLed, so the ioctls are numbed.
> > > > And I need to let the coffee soak in before sending.
> > > >
> > > >> In which case this patch is fine - it is only patch 1 that requires
> > > >> updating.
> > > >>
> > > >> Cheers,
> > > >> Kent.
> > > >
> > >
> > > The fix for the user-space issue may be more-or-less correct but the problem is
> > > deeper and this won't fix it for in-kernel users.
> > >
> > > Herve: please consider the following DT snippet:
> > >
> > > gpio0 {
> > > compatible = "foo";
> > >
> > > gpio-controller;
> > > #gpio-cells = <2>;
> > > interrupt-controller;
> > > #interrupt-cells = <1>;
> > > ngpios = <8>;
> > > };
> > >
> > > consumer {
> > > compatible = "bar";
> > >
> > > interrupts-extended = <&gpio0 0>;
> > > };
> > >
> > > If you unbind the "gpio0" device after the consumer requested the interrupt,
> > > you'll get the same splat. And device links will not help you here (on that
> > > note: Saravana: is there anything we could do about it? Have you even
> > > considered making the irqchip subsystem use the driver model in any way? Is it
> > > even feasible?).

I did add support to irqchip to use the driver model. See
IRQCHIP_PLATFORM_DRIVER_BEGIN() and uses of it. So this makes sure
the probe ordering is correct.

But when I added that support, there was some pushback on making the
modules removable[1]. That's why you'll see that the
IRQCHIP_PLATFORM_DRIVER_BEGIN() macro sets .suppress_bind_attrs = true.

Do you have a way to unregister an interrupt controller in your
example? If so, how do you unregister it? It shouldn't be too hard to
extend those macros to add removal support. We could add an
IRQCHIP_MATCH2() that also takes in an exit() function op that gets
called on device unbind.

[1] - https://lore.kernel.org/lkml/86sghas7so.wl-maz@kernel.org/#t

> > >
> > > I would prefer this to be fixed at a lower level than the GPIOLIB character
> > > device.
> >
> > I think this use case is covered.
> > When the consumer device related to the consumer DT node is added, a
> > consumer/supplier relationship is created:
> > parse_interrupts() parses the 'interrupts-extended' property
> > https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/of/property.c#L1316
> > and so, of_link_to_phandle() creates the consumer/supplier link.
> > https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/of/property.c#L1316
> >
> > With that link present, if the supplier is removed, the consumer is removed
> > first.
> > The consumer should release the interrupt during its remove process (i.e.
> > explicitly in its .remove() or implicitly because of a devm_*() call).
> >
> > At least, it is my understanding.
>
> Well, then it doesn't work, because I literally just tried it before
> sending my previous email.

For your gpio0 device, can you see why __device_release_driver()
doesn't end up calling device_links_unbind_consumers()?

Also, can you look at the
/sys/class/devlink/<bus:gpio0-devicename>--<consumer device name>
folders and see what the status file says before you try to unbind the
gpio0 device? It should say "active".

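If it helps to script that check, here is a rough userspace sketch in C. The helper names read_devlink_status() and devlink_is_active() are made up for illustration and are not part of any kernel or library API. The only assumption taken from the text above is that the link directory contains a "status" file whose content is a single line such as "active".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the "status" attribute found in a device link
 * directory such as /sys/class/devlink/<supplier>--<consumer>.
 * Returns 0 on success, -1 on failure. On success, buf holds the status
 * string with the trailing newline removed.
 */
int read_devlink_status(const char *link_dir, char *buf, size_t len)
{
	char path[4096];
	FILE *f;
	int n;

	if (!link_dir || !buf || len == 0)
		return -1;

	/* Build "<link_dir>/status" and reject paths that do not fit. */
	n = snprintf(path, sizeof(path), "%s/status", link_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* The attribute is a single line of text. */
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Strip the trailing newline, if any. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Hypothetical helper: returns 1 if the link status is "active",
 * 0 if it is anything else, and -1 if the status cannot be read.
 */
int devlink_is_active(const char *link_dir)
{
	char status[64];

	if (read_devlink_status(link_dir, status, sizeof(status)) < 0)
		return -1;

	return strcmp(status, "active") == 0;
}
```

Calling devlink_is_active() on the link directory right before unbinding gpio0 would tell you whether the driver core still considers the consumer bound to its supplier at that point. A return of 0 or -1 there would be worth reporting along with the actual status string.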
> Please try it yourself, you'll see.
>
> Also: an interrupt controller may not even have a device consuming its
> DT node (see IRQCHIP_DECLARE()), what happens then?

Yeah, we are screwed in those cases. Ideally we should be rejecting all
submissions for irqchip drivers that use IRQCHIP_DECLARE().

-Saravana