lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 2 Feb 2021 11:46:51 -0800
From:   Saravana Kannan <saravanak@...gle.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Marc Zyngier <maz@...nel.org>,
        Tudor Ambarus <Tudor.Ambarus@...rochip.com>,
        Linus Walleij <linus.walleij@...aro.org>,
        Bartosz Golaszewski <bgolaszewski@...libre.com>,
        Martin Kaiser <martin@...ser.cx>,
        Rob Herring <robh+dt@...nel.org>,
        Frank Rowand <frowand.list@...il.com>,
        Len Brown <lenb@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
        "Cc: Android Kernel" <kernel-team@...roid.com>
Subject: Re: [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers
 for optional suppliers

On Tue, Feb 2, 2021 at 6:34 AM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@...gle.com> wrote:
> >
> > After a deferred probe attempt has exhaused all the devices that can be
> > bound, any device that remains unbound has one/both of these conditions
> > true:
> >
> > (1) It is waiting on its supplier to bind
> > (2) It does not have a matching driver
> >
> > So, to make fw_devlink=on more forgiving of missing drivers for optional
> > suppliers, after we've done a full deferred probe attempt, this patch
> > deletes all device links created by fw_devlink where the supplier hasn't
> > probed yet and the supplier itself is not waiting on any of its
> > suppliers. This allows consumers to probe during another deferred probe
> > attempt if they were waiting on optional suppliers.
> >
> > When modules are enabled, we can't differentiate between a driver
> > that'll never be registered vs a driver that'll be registered soon by
> > loading a module. So, this patch doesn't do anything for the case where
> > modules are enabled.
> >
> > Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> > Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> > ---
> >  drivers/base/base.h |   2 +
> >  drivers/base/core.c | 104 ++++++++++++++++++++++++++++++++++++--------
> >  drivers/base/dd.c   |   5 +++
> >  3 files changed, 94 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > index f5600a83124f..34befe9475cb 100644
> > --- a/drivers/base/base.h
> > +++ b/drivers/base/base.h
> > @@ -186,6 +186,8 @@ extern void device_links_no_driver(struct device *dev);
> >  extern bool device_links_busy(struct device *dev);
> >  extern void device_links_unbind_consumers(struct device *dev);
> >
> > +bool fw_devlink_deferred_probe_retry(void);
> > +
> >  /* device pm support */
> >  void device_pm_move_to_tail(struct device *dev);
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index c95b1daabac7..5e53fc6a21ea 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -50,6 +50,7 @@ static LIST_HEAD(deferred_sync);
> >  static unsigned int defer_sync_state_count = 1;
> >  static DEFINE_MUTEX(fwnode_link_lock);
> >  static bool fw_devlink_is_permissive(void);
> > +static bool fw_devlink_def_probe_retry;
> >
> >  /**
> >   * fwnode_link_add - Create a link between two fwnode_handles.
> > @@ -881,6 +882,13 @@ static void device_link_put_kref(struct device_link *link)
> >                 WARN(1, "Unable to drop a managed device link reference\n");
> >  }
> >
> > +static void device_link_drop_managed(struct device_link *link)
> > +{
> > +       link->flags &= ~DL_FLAG_MANAGED;
> > +       WRITE_ONCE(link->status, DL_STATE_NONE);
> > +       kref_put(&link->kref, __device_link_del);
> > +}
> > +
> >  /**
> >   * device_link_del - Delete a stateless link between two devices.
> >   * @link: Device link to delete.
> > @@ -943,6 +951,29 @@ static void device_links_missing_supplier(struct device *dev)
> >         }
> >  }
> >
> > +/**
> > + * device_links_probe_blocked_by - Return first supplier blocking probe
> > + * @dev: Consumer device.
> > + *
> > + * Checks if the probe of @dev is blocked by a supplier without a driver. If
> > + * yes, return that supplier dev. Otherwise, return NULL.
> > + */
> > +static struct device *device_links_probe_blocked_by(struct device *dev)
> > +{
> > +       struct device_link *link;
> > +
> > +       list_for_each_entry(link, &dev->links.suppliers, c_node) {
> > +               if (!(link->flags & DL_FLAG_MANAGED) ||
> > +                   link->flags & DL_FLAG_SYNC_STATE_ONLY)
> > +                       continue;
> > +
> > +               if (link->status != DL_STATE_AVAILABLE)
> > +                       return link->supplier;
> > +       }
> > +
> > +       return NULL;
>
> This is slightly confusing, because you don't actually use the
> returned device pointer, but simply check it against NULL.
>
> AFAICS this function can return bool and I'd call it
> device_links_probe_blocked().

Yeah, I was writing it this way because I had other future uses for
this. But yeah, I'll just make it a bool and change it later when I
need to.

>
> > +}
> > +
> >  /**
> >   * device_links_check_suppliers - Check presence of supplier drivers.
> >   * @dev: Consumer device.
> > @@ -961,7 +992,7 @@ static void device_links_missing_supplier(struct device *dev)
> >   */
> >  int device_links_check_suppliers(struct device *dev)
> >  {
> > -       struct device_link *link;
> > +       struct device_link *link, *tmp;
> >         int ret = 0;
> >
> >         /*
> > @@ -982,19 +1013,47 @@ int device_links_check_suppliers(struct device *dev)
> >
> >         device_links_write_lock();
> >
> > -       list_for_each_entry(link, &dev->links.suppliers, c_node) {
> > +       list_for_each_entry_safe(link, tmp, &dev->links.suppliers, c_node) {
> >                 if (!(link->flags & DL_FLAG_MANAGED))
> >                         continue;
> >
> > -               if (link->status != DL_STATE_AVAILABLE &&
> > -                   !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
> > -                       device_links_missing_supplier(dev);
> > -                       dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> > -                               dev_name(link->supplier));
> > -                       ret = -EPROBE_DEFER;
> > -                       break;
> > +
> > +               if (link->status == DL_STATE_AVAILABLE ||
> > +                   link->flags & DL_FLAG_SYNC_STATE_ONLY) {
> > +                       WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
> > +                       continue;
> > +               }
> > +
> > +               /*
> > +                * After a deferred probe attempt has exhaused all the devices
> > +                * that can be bound, any device that remains unbound has
> > +                * one/both of these conditions true:
> > +                *
> > +                * (1) It is waiting on its supplier to bind
> > +                * (2) It does not have a matching driver
> > +                *
> > +                * If this device is waiting on a supplier to bind to a driver,
> > +                * we make sure condition (1) above is not true for the
> > +                * supplier. In which case, condition (2) has to be true for
> > +                * the supplier. That is, the supplier doesn't have a matching
> > +                * driver.
> > +                *
> > +                * When we find such a supplier, we delete the device link if
> > +                * it was created by fw_devlink. This it to allow the consumer
> > +                * to probe in case the supplier is an optional.
> > +                */
> > +               if (fw_devlink_def_probe_retry &&
>
> I would put a IS_ENABLED(CONFIG_MODULES) check here to let the
> compiler optimize out the code depending on it and make it clear that
> this is a NOP if there are modules.

Will do.

>
> > +                   link->flags & DL_FLAG_INFERRED &&
> > +                   !device_links_probe_blocked_by(link->supplier)) {
> > +                       device_link_drop_managed(link);
> > +                       continue;
> >                 }
> > -               WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
> > +
> > +               device_links_missing_supplier(dev);
> > +               dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> > +                       dev_name(link->supplier));
> > +               ret = -EPROBE_DEFER;
> > +               break;
> >         }
> >         dev->links.status = DL_DEV_PROBING;
> >
> > @@ -1132,13 +1191,6 @@ static void __device_links_supplier_defer_sync(struct device *sup)
> >                 list_add_tail(&sup->links.defer_sync, &deferred_sync);
> >  }
> >
> > -static void device_link_drop_managed(struct device_link *link)
> > -{
> > -       link->flags &= ~DL_FLAG_MANAGED;
> > -       WRITE_ONCE(link->status, DL_STATE_NONE);
> > -       kref_put(&link->kref, __device_link_del);
> > -}
> > -
> >  static ssize_t waiting_for_supplier_show(struct device *dev,
> >                                          struct device_attribute *attr,
> >                                          char *buf)
> > @@ -1597,6 +1649,24 @@ static int fw_devlink_relax_cycle(struct device *con, void *sup)
> >         return ret;
> >  }
> >
> > +/** fw_devlink_deferred_probe_retry - Set up fw_devlink for probe retries
>
> Kerneldoc format mistake.

Ack

>
> > + *
> > + * This function requests fw_devlink to set itself up for a deferred probe
> > + * retry. This allows fw_devlink to ignore device links it created to
> > + * suppliers that'll never probe. This is necessary in case some of the
> > + * suppliers are optional and their consumers can probe without them.
> > + *
> > + * Returns true if deferred probe retry is likely to make any difference.
> > + */
> > +bool fw_devlink_deferred_probe_retry(void)
> > +{
> > +       if (IS_ENABLED(CONFIG_MODULES))
> > +               return false;
>
> To make the above more visible, I'd fold this function into the caller.

I had written it this way because I'm thinking of adding a timeout
heuristic for MODULES in here. I can move it to the caller if you feel
strongly about it.

>
> > +
> > +       fw_devlink_def_probe_retry = true;
> > +       return fw_devlink_get_flags() && !fw_devlink_is_permissive();
> > +}
> > +
> >  /**
> >   * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
> >   * @con - Consumer device for the device link
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 9179825ff646..11325df2327f 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -317,6 +317,11 @@ static int deferred_probe_initcall(void)
> >         driver_deferred_probe_trigger();
> >         /* Sort as many dependencies as possible before exiting initcalls */
> >         flush_work(&deferred_probe_work);
> > +
> > +       if (fw_devlink_deferred_probe_retry()) {
> > +               driver_deferred_probe_trigger();
> > +               flush_work(&deferred_probe_work);
> > +       }
> >         initcalls_done = true;
> >
> >         /*
> > --
>
> Overall, the "let's do nothing if modules are not enabled" approach is
> a bit disappointing, because what if somebody builds all of the
> drivers needed for boot in and enables modules anyway, for example to
> allow USB drivers to be probed dynamically?

Yeah, I'm disappointed too :( But I'm trying to get it to work for
!MODULES so that we can enable fw_devlink=on by default at least for
!MODULES to make sure drivers don't introduce more issues going
forward. And then I plan to continue working on making it work
correctly for MODULES case too.

Getting fw_devlink=on to work perfectly for MODULES and !MODULES is
not a problem at all. But it needs fixing a bunch of drivers (mostly
simple fixes like setting the right flag, handling deferred probes
correctly, etc), but I'm hitting a catch-22 here. I can't find the
drivers without setting fw_devlink=on by default. But if I did that,
it's going to break a bunch of boards.

What's your thought on leaving fw_devlink=on by default on 5.12 and
fixing drivers as issues are reported? If that's a no, do you have any
other ideas on how to deal with this catch-22?

Thanks,
Saravana

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ