Message-ID: <mkhlikoxwmfzenn27waowo4bovioswok43az64jd6q6qua6d5p@yqqma72erlkl>
Date: Wed, 4 Feb 2026 15:52:12 -0600
From: Bjorn Andersson <andersson@...nel.org>
To: Hans de Goede <johannes.goede@....qualcomm.com>
Cc: Saravana Kannan <saravanak@...nel.org>, Rob Herring <robh@...nel.org>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "Rafael J . Wysocki" <rafael@...nel.org>, 
	Danilo Krummrich <dakr@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] driver core: Make deferred_probe_timeout default a
 Kconfig option

On Wed, Feb 04, 2026 at 04:00:45PM +0100, Hans de Goede wrote:

Thanks for posting this, Hans. Let's loop in Saravana and Rob as well,
who looked at this subject in the past.

> Code using driver_deferred_probe_check_state() differs from most
> EPROBE_DEFER handling in the kernel. Where other EPROBE_DEFER handling
> (e.g. clks, gpios and regulators) waits indefinitely for suppliers to
> show up, code using driver_deferred_probe_check_state() will fail
> after the deferred_probe_timeout.
> 
> This is a problem for generic distro kernels which want to support many
> boards using a single kernel build. These kernels want as much drivers to
> be modular as possible. The initrd also should be as small as possible,
> so the initrd will *not* have drivers not needing to get the rootfs.
> 

This problem manifests itself in the upstream kernel, for upstream
developers as well.

On some platforms we have intermittent boot failures even when testing
with a minimal ramdisk (with kernel modules overlaid). Because the
module loading order is non-deterministic, it can take a while before
the providers are lined up.

Another concrete issue is that the Qualcomm CPUfreq driver, while
builtin, on many targets has dependencies on drivers that we today mark
as modules. So with a decently sized ramdisk we don't have time to
unpack the ramdisk before things start breaking.


The typical symptom I see when this happens is that the SMMU fails to
find its power-domain provider. In some cases the result is a
non-functional system, but often the hardware state ends up such that
the board resets...

> Combine this with waiting for a full-disk encryption password in
> the initrd and it is pretty much guaranteed that the default 10s timeout
> will be hit, causing probe() failures when drivers on the rootfs happen
> to get modprobe-d before other rootfs modules providing their suppliers.
> 

Indeed, LUKS is a challenge: any form of debugging of which kernel
modules you forgot to inject into your ramdisk becomes impossible.

> Make the default timeout configurable from Kconfig to allow distro kernel
> configs where many of the supplier drivers are modules to set the default
> through Kconfig and allow using a value of -1 to disable the timeout
> (wait indefinitely).
> 

The timeout mechanism was introduced to handle those exceptional cases
where distro-kernels are missing specific provider drivers but still
want to roll the dice and try to reach a functional user space to allow
the user to correct the issue.

There are clearly many situations where that will not work in today's
kernel - and as we evolve sync_state, this problem is going to grow.


I therefore would, once again, like to see the default value be "no
timeout". We can keep the option for the user to opt in to the
alternative (riskier) path. For this the command line option would
suffice, but with a new default.


The added Kconfig option of course would allow distributions to set the
default to -1, but I'd prefer to provide a sane default value.

Regards,
Bjorn

> Signed-off-by: Hans de Goede <johannes.goede@....qualcomm.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 2 +-
>  drivers/base/Kconfig                            | 9 +++++++++
>  drivers/base/dd.c                               | 9 ++++-----
>  3 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 1058f2a6d6a8..80d300c4e16b 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1250,7 +1250,7 @@ Kernel parameters
>  			out hasn't expired, it'll be restarted by each
>  			successful driver registration. This option will also
>  			dump out devices still on the deferred probe list after
> -			retrying.
> +			retrying. Set to -1 to wait indefinitely.
>  
>  	delayacct	[KNL] Enable per-task delay accounting
>  
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 1786d87b29e2..f7d385cbd3ba 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -73,6 +73,15 @@ config DEVTMPFS_SAFE
>  	  with the PROT_EXEC flag. This can break, for example, non-KMS
>  	  video drivers.
>  
> +config DRIVER_DEFERRED_PROBE_TIMEOUT
> +	int "Default value for deferred_probe_timeout"
> +	default 0 if !MODULES
> +	default 10 if MODULES
> +	help
> +	  Set the default value for the deferred_probe_timeout kernel parameter.
> +	  See Documentation/admin-guide/kernel-parameters.txt for a description
> +	  of the deferred_probe_timeout kernel parameter.
> +
>  config STANDALONE
>  	bool "Select only drivers that don't need compile-time external firmware"
>  	default y
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index bea8da5f8a3a..e57144aa168d 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -257,11 +257,7 @@ static int deferred_devs_show(struct seq_file *s, void *data)
>  }
>  DEFINE_SHOW_ATTRIBUTE(deferred_devs);
>  
> -#ifdef CONFIG_MODULES
> -static int driver_deferred_probe_timeout = 10;
> -#else
> -static int driver_deferred_probe_timeout;
> -#endif
> +static int driver_deferred_probe_timeout = CONFIG_DRIVER_DEFERRED_PROBE_TIMEOUT;
>  
>  static int __init deferred_probe_timeout_setup(char *str)
>  {
> @@ -323,6 +319,9 @@ static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_
>  
>  void deferred_probe_extend_timeout(void)
>  {
> +	if (driver_deferred_probe_timeout < 0)
> +		return;
> +
>  	/*
>  	 * If the work hasn't been queued yet or if the work expired, don't
>  	 * start a new one.
> -- 
> 2.52.0
> 
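
To make the semantics discussed above concrete, here is a rough
user-space model (not kernel code) of how the three classes of
deferred_probe_timeout values behave once the patch is applied: a
negative value never arms the timer, zero means "already expired", and
a positive value expires after that many seconds. The struct and all
model_* names are hypothetical and exist only for this sketch. It
assumes that expiry is represented by the timeout dropping to 0 and
that late initcalls have completed, and it ignores the !CONFIG_MODULES
case entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* EPROBE_DEFER is kernel-internal (include/linux/errno.h); define it here. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical model state; a simplification, not the layout used in dd.c. */
struct probe_model {
	int timeout_s;       /* <0: wait forever, 0: expired, >0: seconds left to arm */
	bool initcalls_done; /* late initcalls have completed */
};

/* The timer is only armed for strictly positive timeouts. */
static bool model_timer_armed(const struct probe_model *m)
{
	return m->initcalls_done && m->timeout_s > 0;
}

/* Models the timer firing: a no-op unless the timer was armed. */
static void model_timeout_expired(struct probe_model *m)
{
	if (model_timer_armed(m))
		m->timeout_s = 0;
}

/* What a consumer asking "should I keep deferring?" would get back. */
static int model_check_state(const struct probe_model *m)
{
	if (m->timeout_s == 0 && m->initcalls_done)
		return -ETIMEDOUT;
	return -EPROBE_DEFER;
}

/* Walks through the -1 and 10 second cases. */
static void model_self_test(void)
{
	struct probe_model forever = { .timeout_s = -1, .initcalls_done = true };
	struct probe_model ten = { .timeout_s = 10, .initcalls_done = true };

	model_timeout_expired(&forever); /* never armed, so nothing changes */
	assert(model_check_state(&forever) == -EPROBE_DEFER);

	assert(model_check_state(&ten) == -EPROBE_DEFER);
	model_timeout_expired(&ten);
	assert(model_check_state(&ten) == -ETIMEDOUT);
}
```

In this model a consumer with timeout_s = -1 keeps getting
-EPROBE_DEFER no matter how long the supplier takes to show up, which
is the "wait indefinitely" behaviour, while the 10 second default turns
into -ETIMEDOUT once the timer has fired.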
