lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <01f733ab81959e4cf847cbf1d521ad9d@kernel.org>
Date:   Wed, 20 Jan 2021 15:39:30 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     Saravana Kannan <saravanak@...gle.com>
Cc:     Linus Walleij <linus.walleij@...aro.org>,
        Bartosz Golaszewski <bgolaszewski@...libre.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jisheng Zhang <Jisheng.Zhang@...aptics.com>,
        Kever Yang <kever.yang@...k-chips.com>,
        Android Kernel Team <kernel-team@...roid.com>,
        linux-gpio@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] gpiolib: Bind gpio_device to a driver to enable
 fw_devlink=on by default

On 2021-01-18 20:38, Saravana Kannan wrote:
> On Sun, Jan 17, 2021 at 4:02 AM Marc Zyngier <maz@...nel.org> wrote:
>> 
>> Hi Saravana,
>> 
>> Thanks for posting this, much appreciated.
>> 
>> On Sat, 16 Jan 2021 01:14:11 +0000,
>> Saravana Kannan <saravanak@...gle.com> wrote:
>> >
>> > There are multiple instances of GPIO devictree nodes of the form:
>> >
>> > foo {
>> >       compatible = "acme,foo";
>> >       ...
>> >
>> >       gpio0: gpio0@...xxxxx {
>> >               compatible = "acme,bar";
>> >               ...
>> >               gpio-controller;
>> >       };
>> >
>> >       gpio1: gpio1@...xxxxx {
>> >               compatible = "acme,bar";
>> >               ...
>> >               gpio-controller;
>> >       };
>> >
>> >       ...
>> > }
>> >
>> > bazz {
>> >       my-gpios = <&gpio0 ...>;
>> > }
>> >
>> > Case 1: The driver for "foo" populates struct device for these gpio*
>> > nodes and then probes them using a driver that binds with "acme,bar".
>> > This lines up with how DT nodes with the "compatible" property are
>> > generally converted to struct devices and then registered with driver
>> > core to probe them. This also allows the gpio* devices to hook into all
>> > the driver core capabilities like runtime PM, probe deferral,
>> > suspend/resume ordering, device links, etc.
>> >
>> > Case 2: The driver for "foo" doesn't populate its child device nodes
>> > with "compatible" property and instead just loops through its child
>> > nodes and directly registers the GPIOs with gpiolib without ever
>> > populating a struct device or binding a driver to it.
>> 
>> That's not quite an accurate description. The gpiolib subsystem does
>> create a struct device. It doesn't register a driver though, which is
>> what causes the issue with fr_devlink (more on that below).
> 
> The devices created by gpiolib care are created for case 1 and case 2.
> They are just devices gpiolib uses to represent a virtual software
> device to hook different attributes to and expose them in sysfs. I'm
> not talking about those devices here. The devices I'm referring to are
> devices that represent the actual HW IP -- so what I'm saying is
> accurate.

Please read what you wrote : "without ever populating a struct device".
I stand by the "not quite accurate".

> 
>> >
>> > Drivers that follow the case 2 cause problems with fw_devlink=on.  This
>> > is because fw_devlink will prevent bazz from probing until there's a
>> > struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
>> > a GPIO supplier). Once the struct device is available, fw_devlink will
>> > create a device link between with gpio0 as the supplier and bazz as the
>> > consumer. After this point, the device link will prevent bazz from
>> > probing until its supplier (the gpio0 device) has bound to a driver.
>> > Once the supplier is bound to a driver, the probe of bazz is triggered
>> > automatically.
>> >
>> > Finding and refactoring all the instances of drivers that follow case 2
>> > will cause a lot of code churn and it not something that can be done in
>> > one shot. Examples of such instances are [1] [2].
>> >
>> > This patch works around this problem and avoids all the code churn by
>> > simply creating a stub driver to bind to the gpio_device. Since the
>> > gpio_device already points to the GPIO device tree node, this allows all
>> > the consumers to continue probing when the driver follows case 2.
>> >
>> > If/when all the old drivers are refactored, we can revert this
>> > patch.
>> 
>> My personal gripe with this approach is that it is an abrupt change in
>> the way DT and device model are mapped onto each other.
>> 
>> As far as I know (and someone please correct me if I am wrong), there
>> is zero expectation that a device/subdevice/random IP block described
>> by a node with a "compatible" property will end-up being managed by a
>> driver that is bound to that particular node.
>> 
>> The node/subnode division is a good way to express some HW boundaries,
>> but doesn't say anything about the way this should be handled in the
>> kernel. Assuming that everything containing a "compatible" string will
>> eventually be bound to a driver is IMO pretty limiting.
> 
> The default behavior of of_platform_populate() is to create a struct
> device for every node with "compatible" property. That's how top level
> devices (or children of simple-bus devices) are populated. IIRC,
> that's what a lot of other busses do too. So, if anything, not having
> a struct device for nodes with "compatible" property is an exception.

Again, that's not a description of the reality. The case we are talking
about here does have a struct device. The misconception you keep
repeating is that binding it to a driver is the only valid approach.

> 
> Honestly, if one has a driver that supports a HW IP, I don't see any
> reason it should operate outside of the device-driver model supported
> by driver core. The driver code is there for a reason -- to solve all
> the common problems faced by drivers.

News flash: this isn't the case. Most of the code I deal with cannot
be represented as a driver, because it is needed way earlier
than the device model.

> Operating outside of it just
> causes reinventing the wheel, things like playing chicken with
> initcalls to make sure drivers initialize their device in the right
> order, not working with deferred probe, etc. For example, the rockchip
> driver in your device (the one that follows case 2) tries to get some
> clocks. But if that fails with -EPROBE_DEFER, the driver has no way
> for it to recover and just doesn't register the GPIO anymore.
> 
> Obviously exceptions are allowed for devices that are needed before
> the driver core even comes up -- like early, clocks, irqs and timers
> for the kernel/scheduler to kick off.

There you go. Exceptions. Tons of. The trouble is, you are rewriting
the rules of the game. Except that you are about 10 year late to
the party. Forcing your changes on everyone by saying that was
perfectly valid a month ago is now illegal doesn't really fly.

Anyway, I said what I had to say. If platforms break with this
change, I'll expect it to be disabled in 5.12.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ