lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 7 Aug 2019 19:05:11 -0700
From:   Frank Rowand <frowand.list@...il.com>
To:     Saravana Kannan <saravanak@...gle.com>,
        Rob Herring <robh+dt@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
        David Collins <collinsd@...eaurora.org>,
        kernel-team@...roid.com
Subject: Re: [PATCH v7 2/7] driver core: Add edit_links() callback for drivers

> Date: Tue, 23 Jul 2019 17:10:55 -0700
> Subject: [PATCH v7 2/7] driver core: Add edit_links() callback for drivers
> From: Saravana Kannan <saravanak@...gle.com>
> 
> The driver core/bus adding supplier-consumer dependencies by default

> enables functional dependencies to be tracked correctly even when the
> consumer devices haven't had their drivers registered or loaded (if they
> are modules).

  enables functional dependencies to be tracked correctly before the
  consumer device drivers are registered or loaded (if they are modules).

> 
> However, when the bus incorrectly adds dependencies that it shouldn't

                    ^^^ driver core/bus

> have added, the devices might never probe.

Explain what causes a  dependency to be incorrectly added.

Is this a bug in the dependency detection code?

Are there cases where the dependency detection code can not reliably determine
whether there truly is a dependency?

> 
> For example, if device-C is a consumer of device-S and they have
> phandles to each other in DT, the following could happen:
> 
> 1.  Device-S get added first.
> 2.  The bus add_links() callback will (incorrectly) try to link it as
>     a consumer of device-C.
> 3.  Since device-C isn't present, device-S will be put in
>     "waiting-for-supplier" list.
> 4.  Device-C gets added next.
> 5.  All devices in "waiting-for-supplier" list are retried for linking.
> 6.  Device-S gets linked as consumer to Device-C.
> 7.  The bus add_links() callback will (correctly) try to link it as
>     a consumer of device-S.
> 8.  This isn't allowed because it would create a cyclic device links.
> 
> Neither devices will get probed since the supplier is marked as
> dependent on the consumer. And the consumer will never probe because the
> consumer can't get resources from the supplier.
> 
> Without this patch, things stay in this broken state. However, with this
> patch, the execution will continue like this:
> 
> 9.  Device-C's driver is loaded.

Change comment to:
  
  For example, if device-C is a consumer of device-S and they have phandles
  referencing each other in the devicetree, the following could happen:

  1.  Device-S is added first.
        - The bus add_links() callback will (incorrectly) link device-S
          as a consumer of device-C, and device-S will be put in the
          "wait_for_suppliers" list.

  2.  Device-C is added next.
        - All devices in the "wait_for_suppliers" list are retried for linking.
        - Device-S remains linked as a consumer to device-C.
        - The bus add_links() callback will (correctly) try to link device-C as
          a consumer of device-S.
        - The link attempt will fail because it would create a cyclic device
          link, and device-C will be put in the "wait_for_suppliers" list.

  Device-S will not be probed because it is in the "wait_for_suppliers" list.
  Device-C will not be probed because it is in the "wait_for_suppliers" list.
  
> 
> Without this patch, things stay in this broken state. However, with this
> patch, the execution will continue like this:
> 
> 9.  Device-C's driver is loaded.

What is "loaded"?  Does that mean the device-C probe succeeds?

What causes device-C to be probed?  The normal processing of -EPROBE_DEFER
devices?


> 10. Device-C's driver removes Device-S as a consumer of Device-C.
> 11. Device-C's driver adds Device-C as a consumer of Device-S.
> 12. Device-S probes.
> 14. Device-C probes.
> 
> Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> ---
>  drivers/base/core.c    | 24 ++++++++++++++++++++++--
>  drivers/base/dd.c      | 29 +++++++++++++++++++++++++++++
>  include/linux/device.h | 18 ++++++++++++++++++
>  3 files changed, 69 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 1b4eb221968f..733d8a9aec76 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -422,6 +422,19 @@ static void device_link_wait_for_supplier(struct device *consumer)
>  	mutex_unlock(&wfs_lock);
>  }
>  
> +/**
> + * device_link_remove_from_wfs - Unmark device as waiting for supplier
> + * @consumer: Consumer device
> + *
> + * Unmark the consumer device as waiting for suppliers to become available.
> + */
> +void device_link_remove_from_wfs(struct device *consumer)

Misleading function name.
Incorrect description.

Does not remove consumer from list wait_for_suppliers.

At best, consumer might eventually get removed from list wait_for_suppliers
if device_link_check_waiting_consumers() is called again.

> +{
> +	mutex_lock(&wfs_lock);
> +	list_del_init(&consumer->links.needs_suppliers);
> +	mutex_unlock(&wfs_lock);
> +}
> +
>  /**
>   * device_link_check_waiting_consumers - Try to unmark waiting consumers
>   *
> @@ -439,12 +452,19 @@ static void device_link_wait_for_supplier(struct device *consumer)
>  static void device_link_check_waiting_consumers(void)
>  {
>  	struct device *dev, *tmp;
> +	int ret;
>  
>  	mutex_lock(&wfs_lock);
>  	list_for_each_entry_safe(dev, tmp, &wait_for_suppliers,
> -				 links.needs_suppliers)
> -		if (!dev->bus->add_links(dev))
> +				 links.needs_suppliers) {
> +		ret = 0;
> +		if (dev->has_edit_links)
> +			ret = driver_edit_links(dev);
> +		else if (dev->bus->add_links)
> +			ret = dev->bus->add_links(dev);
> +		if (!ret)
>  			list_del_init(&dev->links.needs_suppliers);
> +	}
>  	mutex_unlock(&wfs_lock);
>  }
>  
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 994a90747420..5e7041ede0d7 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -698,6 +698,12 @@ int driver_probe_device(struct device_driver *drv, struct device *dev)
>  	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
>  		 drv->bus->name, __func__, dev_name(dev), drv->name);
>  
> +	if (drv->edit_links) {
> +		if (drv->edit_links(dev))
> +			dev->has_edit_links = true;
> +		else
> +			device_link_remove_from_wfs(dev);
> +	}

For the purposes of the following paragraphs, I refer to dev as "dev_1" to
distinguish it from a new dev that will be encountered later.  The following
paragraphs assume dev_1 has a supplier dependency for a supplier that has
not probed yet.

Q. Why the extra level of indirection?

A. really_probe() does not set dev->driver before returning if
   device_links_check_suppliers() returned -EPROBE_DEFER.  Thus
   device_link_check_waiting_consumers() can not directly check
   "if (dev_1->driver->edit_links)".

   The added driver_probe_device() code is setting dev_1->has_edit_links in the
   probe path, then device_link_check_waiting_consumers() will use the value
   of dev_1->has_edit_links instead of directly checking
   "if (dev_1->driver->edit_links)".

   If really_probe() was modified to set dev->driver in this
   case then the need for dev->has_edit_links is removed and
   driver_edit_links() is not needed, since dev->driver would
   be available.  Removing driver_edit_links() simplifies the
   code.

device_add() calls dev_1->bus->add_links(dev_1), thus dev_1 will have the
supplier links set (for any suppliers not currently available) and be on
list wait_for_suppliers.

Then device_add() calls bus_probe_device(), leading to calling
driver_probe_device().  The above code fragment either sets
dev_1->has_edit_links or removes the needs_suppliers links from dev_1.
dev_1 remains on list wait_for_suppliers.

If (drv->edit_links(dev_1) returns 0 then device_link_remove_from_wfs()
removes the supplier links.  Shouldn't device_link_remove_from_wfs()  also
remove the device from the list wait_for_suppliers?

The next time a device is added, device_link_check_waiting_consumers() will
be called and dev_1 will be on list wait_for_suppliers, thus
device_link_check_waiting_consumers() will find dev_1->has_edit_links true
and thus call driver_edit_links() instead of calling dev->bus->add_links().

The comment in device.h, later in this patch, says that drv->edit_links() is
responsible for editing the device links for dev.  The comment provides no
guidance on how drv->edit_links() is supposed to determine what edits to
perform.  No example drv->edit_links() function is provided in this patch
series.  dev_1->bus->add_links(dev_1) may have added one or more suppliers
to its needs_suppliers link.  drv->edit_links() needs to be able to handle
all possible variants of what suppliers are on the needs_suppliers link.


>  	pm_runtime_get_suppliers(dev);
>  	if (dev->parent)
>  		pm_runtime_get_sync(dev->parent);
> @@ -786,6 +792,29 @@ struct device_attach_data {
>  	bool have_async;
>  };
>  
> +static int __driver_edit_links(struct device_driver *drv, void *data)
> +{
> +	struct device *dev = data;
> +
> +	if (!drv->edit_links)
> +		return 0;
> +
> +	if (driver_match_device(drv, dev) <= 0)
> +		return 0;
> +
> +	return drv->edit_links(dev);
> +}
> +
> +int driver_edit_links(struct device *dev)
> +{
> +	int ret;
> +
> +	device_lock(dev);
> +	ret = bus_for_each_drv(dev->bus, NULL, dev, __driver_edit_links);
> +	device_unlock(dev);
> +	return ret;
> +}
> +
>  static int __device_attach_driver(struct device_driver *drv, void *_data)
>  {
>  	struct device_attach_data *data = _data;
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 5d70babb7462..35aed50033c4 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -263,6 +263,20 @@ enum probe_type {
>   * @probe_type:	Type of the probe (synchronous or asynchronous) to use.
>   * @of_match_table: The open firmware table.
>   * @acpi_match_table: The ACPI match table.
> + * @edit_links:	Called to allow a matched driver to edit the device links the

Where is the value of field edit_links set?

Is it only in an out of tree driver?  If so, I would like to see an
example implementation of the edit_links() function.


> + *		bus might have added incorrectly. This will be useful to handle
> + *		cases where the bus incorrectly adds functional dependencies
> + *		that aren't true or tries to create cyclic dependencies. But
> + *		doesn't correctly handle functional dependencies that are
> + *		missed by the bus as the supplier's sync_state might get to
> + *		execute before the driver for a missing consumer is loaded and
> + *		gets to edit the device links for the consumer.
> + *
> + *		This function might be called multiple times after a new device
> + *		is added.  The function is expected to create all the device
> + *		links for the new device and return 0 if it was completed
> + *		successfully or return an error if it needs to be reattempted
> + *		in the future.
>   * @probe:	Called to query the existence of a specific device,
>   *		whether this driver can work with it, and bind the driver
>   *		to a specific device.
> @@ -302,6 +316,7 @@ struct device_driver {
>  	const struct of_device_id	*of_match_table;
>  	const struct acpi_device_id	*acpi_match_table;
>  
> +	int (*edit_links)(struct device *dev);
>  	int (*probe) (struct device *dev);
>  	int (*remove) (struct device *dev);
>  	void (*shutdown) (struct device *dev);
> @@ -1078,6 +1093,7 @@ struct device {
>  	bool			offline_disabled:1;
>  	bool			offline:1;
>  	bool			of_node_reused:1;
> +	bool			has_edit_links:1;

Add has_edit_links to the struct's kernel_doc


>  #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
>      defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
>      defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
> @@ -1329,6 +1345,7 @@ extern int  __must_check device_attach(struct device *dev);
>  extern int __must_check driver_attach(struct device_driver *drv);
>  extern void device_initial_probe(struct device *dev);
>  extern int __must_check device_reprobe(struct device *dev);
> +extern int driver_edit_links(struct device *dev);
>  
>  extern bool device_is_bound(struct device *dev);
>  
> @@ -1419,6 +1436,7 @@ struct device_link *device_link_add(struct device *consumer,
>  				    struct device *supplier, u32 flags);
>  void device_link_del(struct device_link *link);
>  void device_link_remove(void *consumer, struct device *supplier);
> +void device_link_remove_from_wfs(struct device *consumer);
>  
>  #ifndef dev_fmt
>  #define dev_fmt(fmt) fmt
> -- 
> 2.22.0.709.g102302147b-goog
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ