Message-ID: <CACRMN=eyiet0sDK+wXLFpT5GcPDrcZS_rC2mgoED2J5fnRbu+Q@mail.gmail.com>
Date: Sat, 7 Feb 2026 17:06:08 -0800
From: Saravana Kannan <saravanak@...nel.org>
To: Hans de Goede <johannes.goede@....qualcomm.com>
Cc: Bjorn Andersson <andersson@...nel.org>, Saravana Kannan <saravanak@...nel.org>, 
	Rob Herring <robh@...nel.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, 
	"Rafael J . Wysocki" <rafael@...nel.org>, Danilo Krummrich <dakr@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] driver core: Make deferred_probe_timeout default a Kconfig option

On Thu, Feb 5, 2026 at 1:17 AM Hans de Goede
<johannes.goede@....qualcomm.com> wrote:
>
> Hi Bjorn,
>
> Thank you for your comments.
>
> On 4-Feb-26 22:52, Bjorn Andersson wrote:
> > On Wed, Feb 04, 2026 at 04:00:45PM +0100, Hans de Goede wrote:
> >
> > Thanks for posting this, Hans. Let's loop in Saravana and Rob as well,
> > who looked at this subject in the past.

Thanks for looping me in, Bjorn.

> >
> >> Code using driver_deferred_probe_check_state() differs from most
> >> EPROBE_DEFER handling in the kernel. Where other EPROBE_DEFER handling
> >> (e.g. clks, gpios and regulators) waits indefinitely for suppliers to
> >> show up, code using driver_deferred_probe_check_state() will fail
> >> after the deferred_probe_timeout.
> >>
> >> This is a problem for generic distro kernels which want to support many
> >> boards using a single kernel build. These kernels want as many drivers as
> >> possible to be modular. The initrd also should be as small as possible,
> >> so the initrd will *not* have drivers not needing to get the rootfs.
> >>
> >
> > This problem manifests itself in the upstream kernel, for upstream
> > developers as well.
> >
> > On some platforms we have intermittent boot failures even when testing
> > with a minimal ramdisk (with kernel modules overlaid), because of the
> > non-deterministic module loading order it might take time before we get
> > the providers lined up.
> >
> > Another concrete issue is that the Qualcomm CPUfreq driver, while
> > builtin, on many targets has dependencies on drivers that we today mark
> > as modules. So with a decently sized ramdisk we don't have time to
> > unpack the ramdisk before things start breaking.
> >
> >
> > The typical symptom I see when this happens is that the SMMU fails to
> > find its power-domain provider, in some cases the result is a
> > non-functional system, but often the hardware state ends up such that
> > the board resets...
> >
> >> Combine this with waiting for a full-disk encryption password in
> >> the initrd and it is pretty much guaranteed that the default 10s timeout
> >> will be hit, causing probe() failures when drivers on the rootfs happen
> >> to get modprobe-d before other rootfs modules providing their suppliers.
> >>
> >
> > Indeed, LUKS is a challenge, performing any form of debugging of what
> > kernel modules you forgot to inject into your ramdisk is impossible.
> >
> >> Make the default timeout configurable from Kconfig to allow distro kernel
> >> configs where many of the supplier drivers are modules to set the default
> >> through Kconfig and allow using a value of -1 to disable the timeout
> >> (wait indefinitely).
> >>
> >
> > The timeout mechanism was introduced to handle those exceptional cases
> > where distro-kernels are missing specific provider drivers but still
> > want to roll the dice and try to reach a functional user space to allow
> > the user to correct the issue.
> >
> > There are clearly many situations where that will not work in today's
> > kernel - and as we evolve sync_state, this problem is going to grow.
> >
> >
> > I therefore would, once again, like to see the default value to be "no
> > timeout". We can keep the option for the user to opt-in to the
> > alternative (riskier) path. For this the command line option would
> > suffice, but with a new default.
> >
> >
> > The added Kconfig option of course would allow distributions to set the
> > default to -1, but I'd prefer to provide a sane default value.
>
> AFAICT when this was discussed before opinions on this were divided.

Yeah, IMHO, no timeout should be the default value. But as Hans
mentioned, getting a consensus on that seems impossible.

>
> Which is why I've chosen to just make the default configurable so
> that distros/people can choose.

I'm okay with this patch. I was going to say that you don't need a
separate -1 and we already have an option to wait indefinitely. But
looks like you figured that out yourself (but didn't cc me). So, if
you drop the changes to the doc and fix up the code to just use
the config value (and not touch the rest of the code), then:

Acked-by: Saravana Kannan <saravanak@...nel.org>

-Saravana

>
> I'm not necessarily against making -1 the default, but I think that
> might be a hard sell to some people.
>
> Note that if this lands you can always make the default -1 for
> qcom specific defconfigs.
>
> Regards,
>
> Hans
>
>
>
>
> >> Signed-off-by: Hans de Goede <johannes.goede@....qualcomm.com>
> >> ---
> >>  Documentation/admin-guide/kernel-parameters.txt | 2 +-
> >>  drivers/base/Kconfig                            | 9 +++++++++
> >>  drivers/base/dd.c                               | 9 ++++-----
> >>  3 files changed, 14 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >> index 1058f2a6d6a8..80d300c4e16b 100644
> >> --- a/Documentation/admin-guide/kernel-parameters.txt
> >> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >> @@ -1250,7 +1250,7 @@ Kernel parameters
> >>                      out hasn't expired, it'll be restarted by each
> >>                      successful driver registration. This option will also
> >>                      dump out devices still on the deferred probe list after
> >> -                    retrying.
> >> +                    retrying. Set to -1 to wait indefinitely.
> >>
> >>      delayacct       [KNL] Enable per-task delay accounting
> >>
> >> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> >> index 1786d87b29e2..f7d385cbd3ba 100644
> >> --- a/drivers/base/Kconfig
> >> +++ b/drivers/base/Kconfig
> >> @@ -73,6 +73,15 @@ config DEVTMPFS_SAFE
> >>        with the PROT_EXEC flag. This can break, for example, non-KMS
> >>        video drivers.
> >>
> >> +config DRIVER_DEFERRED_PROBE_TIMEOUT
> >> +    int "Default value for deferred_probe_timeout"
> >> +    default 0 if !MODULES
> >> +    default 10 if MODULES
> >> +    help
> >> +      Set the default value for the deferred_probe_timeout kernel parameter.
> >> +      See Documentation/admin-guide/kernel-parameters.txt for a description
> >> +      of the deferred_probe_timeout kernel parameter.
> >> +
> >>  config STANDALONE
> >>      bool "Select only drivers that don't need compile-time external firmware"
> >>      default y
> >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> >> index bea8da5f8a3a..e57144aa168d 100644
> >> --- a/drivers/base/dd.c
> >> +++ b/drivers/base/dd.c
> >> @@ -257,11 +257,7 @@ static int deferred_devs_show(struct seq_file *s, void *data)
> >>  }
> >>  DEFINE_SHOW_ATTRIBUTE(deferred_devs);
> >>
> >> -#ifdef CONFIG_MODULES
> >> -static int driver_deferred_probe_timeout = 10;
> >> -#else
> >> -static int driver_deferred_probe_timeout;
> >> -#endif
> >> +static int driver_deferred_probe_timeout = CONFIG_DRIVER_DEFERRED_PROBE_TIMEOUT;
> >>
> >>  static int __init deferred_probe_timeout_setup(char *str)
> >>  {
> >> @@ -323,6 +319,9 @@ static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_
> >>
> >>  void deferred_probe_extend_timeout(void)
> >>  {
> >> +    if (driver_deferred_probe_timeout < 0)
> >> +            return;
> >> +
> >>      /*
> >>       * If the work hasn't been queued yet or if the work expired, don't
> >>       * start a new one.
> >> --
> >> 2.52.0
> >>
>
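
For readers following the thread, here is a minimal userspace sketch of the
pattern being discussed: a build-time default taken from a Kconfig-style
macro, which a boot-time parameter can override, with a negative value
read as "wait indefinitely" as described in the quoted commit message.
This is not kernel code. The fallback #define stands in for the value
Kconfig would generate, and model_timeout_setup() and
model_waits_indefinitely() are hypothetical helpers invented for
illustration; only the variable name and the CONFIG_ symbol come from the
quoted patch.

```c
#include <stdlib.h>

/*
 * Assumption: in a real kernel build this macro is generated from
 * drivers/base/Kconfig. The fallback of 10 mirrors the quoted
 * "default 10 if MODULES" line and exists only so the sketch compiles.
 */
#ifndef CONFIG_DRIVER_DEFERRED_PROBE_TIMEOUT
#define CONFIG_DRIVER_DEFERRED_PROBE_TIMEOUT 10
#endif

/* Build-time default, as in the one-line dd.c change in the quoted patch. */
static int driver_deferred_probe_timeout = CONFIG_DRIVER_DEFERRED_PROBE_TIMEOUT;

/*
 * Hypothetical stand-in for parsing "deferred_probe_timeout=<n>".
 * Returns 0 on success, -1 if the string is not a plain integer,
 * in which case the current value is left untouched.
 */
static int model_timeout_setup(const char *str)
{
	char *end;
	long val = strtol(str, &end, 10);

	if (end == str || *end != '\0')
		return -1;

	driver_deferred_probe_timeout = (int)val;
	return 0;
}

/* A negative timeout is modelled as "never give up on deferred probes". */
static int model_waits_indefinitely(void)
{
	return driver_deferred_probe_timeout < 0;
}
```

With the fallback default, model_waits_indefinitely() is false until
model_timeout_setup("-1") is called, which mirrors a distro choosing a
finite default in its config while a user opts out on the command line.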
