Date:   Wed, 16 Aug 2023 15:09:27 +0530
From:   Shreeya Patel <shreeya.patel@...labora.com>
To:     saravanak@...gle.com,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     stable@...r.kernel.org, John Stultz <jstultz@...gle.com>,
        "David S. Miller" <davem@...emloft.net>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Rob Herring <robh@...nel.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Yoshihiro Shimoda <yoshihiro.shimoda.uh@...esas.com>,
        Robin Murphy <robin.murphy@....com>,
        Andy Shevchenko <andy.shevchenko@...il.com>,
        Sudeep Holla <sudeep.holla@....com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Naresh Kamboju <naresh.kamboju@...aro.org>,
        Basil Eljuse <Basil.Eljuse@....com>,
        Ferry Toth <fntoth@...il.com>, Arnd Bergmann <arnd@...db.de>,
        Anders Roxell <anders.roxell@...aro.org>,
        linux-pm@...r.kernel.org, Nathan Chancellor <nathan@...nel.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Geert Uytterhoeven <geert+renesas@...der.be>,
        Saravana Kannan <saravanak@...gle.com>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org,
        "gustavo.padovan@...labora.com" <gustavo.padovan@...labora.com>,
        Ricardo Cañuelo Navarro
        <ricardo.canuelo@...labora.com>,
        Guillaume Charles Tucker <guillaume.tucker@...labora.com>,
        usama.anjum@...labora.com, kernelci@...ts.linux.dev
Subject: Re: [PATCH 5.17 127/298] driver core: Fix wait_for_device_probe() &
 deferred_probe_timeout interaction

On 13/06/22 15:40, Greg Kroah-Hartman wrote:
> From: Saravana Kannan <saravanak@...gle.com>
>
> [ Upstream commit 5ee76c256e928455212ab759c51d198fedbe7523 ]
>
> Mounting NFS rootfs was timing out when deferred_probe_timeout was
> non-zero [1].  This was because ip_auto_config() initcall times out
> waiting for the network interfaces to show up when
> deferred_probe_timeout was non-zero. While ip_auto_config() calls
> wait_for_device_probe() to make sure any currently running deferred
> probe work or asynchronous probe finishes, that wasn't sufficient to
> account for devices being deferred until deferred_probe_timeout.
>
> Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits
> until the deferred_probe_timeout fires") tried to fix that by making
> sure wait_for_device_probe() waits for deferred_probe_timeout to expire
> before returning.
>
> However, if wait_for_device_probe() is called from the kernel_init()
> context:
>
> - Before deferred_probe_initcall() [2], it causes the boot process to
>    hang due to a deadlock.
>
> - After deferred_probe_initcall() [3], it blocks kernel_init() from
>    continuing till deferred_probe_timeout expires and beats the point of
>    deferred_probe_timeout that's trying to wait for userspace to load
>    modules.
>
> Neither of this is good. So revert the changes to
> wait_for_device_probe().
>
> [1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
> [2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
> [3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/
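
For readers skimming the quoted commit message, here is a rough userspace
model of the two kernel_init() orderings it describes. This is only a sketch
under assumptions: the function names mirror the kernel ones, but
kernel_init_completes(), timeout_s and give_up_s are hypothetical, and a
bounded Event.wait() stands in for the unbounded wait_event() so the "hang"
case can be observed instead of blocking forever.

```python
import threading


def kernel_init_completes(wait_before_initcall, timeout_s=0.05, give_up_s=0.5):
    """Model kernel_init() running two steps one after the other.

    Returns True if the modeled boot gets past wait_for_device_probe(),
    False if the wait never sees the timeout fire (the modeled hang).
    """
    timeout_fired = threading.Event()

    def deferred_probe_initcall():
        # Models scheduling deferred_probe_timeout_work: the only thing
        # that ever signals the waiters.
        timer = threading.Timer(timeout_s, timeout_fired.set)
        timer.daemon = True
        timer.start()

    def wait_for_device_probe():
        # Models the wait_event() removed by the patch. The real wait has
        # no upper bound; give_up_s exists only to make the hang visible.
        return timeout_fired.wait(timeout=give_up_s)

    if wait_before_initcall:
        steps = [wait_for_device_probe, deferred_probe_initcall]  # case [2]
    else:
        steps = [deferred_probe_initcall, wait_for_device_probe]  # case [3]

    for step in steps:
        if step() is False:
            # The waiter gave up: nothing had scheduled the timeout yet.
            return False
    return True
```

In the first ordering the waiter sits on the same thread that would later
schedule the timeout, so nothing can ever wake it. In the second ordering the
boot does continue, but only after the full timeout has elapsed.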

Hi Saravana, Greg,


KernelCI found that this patch causes the baseline.bootrr.deferred-probe-empty
test to fail on r8a77960-ulcb. Details are below.

KernelCI dashboard link:
https://linux.kernelci.org/test/plan/id/64d2a6be8c1a8435e535b264/

Error messages from the logs:

+ UUID=11236495_1.5.2.4.5
+ set +x
+ export 'PATH=/opt/bootrr/libexec/bootrr/helpers:/lava-11236495/1/../bin:/sbin:/usr/sbin:/bin:/usr/bin'
+ cd /opt/bootrr/libexec/bootrr
+ sh helpers/bootrr-auto
e6800000.ethernet	
e6700000.dma-controller	
e7300000.dma-controller	
e7310000.dma-controller	
ec700000.dma-controller	
ec720000.dma-controller	
fea20000.vsp	
feb00000.display	
fea28000.vsp	
fea30000.vsp	
fe9a0000.vsp	
fe9af000.fcp	
fea27000.fcp	
fea2f000.fcp	
fea37000.fcp	
sound	
ee100000.mmc	
ee140000.mmc	
ec500000.sound	
/lava-11236495/1/../bin/lava-test-case
<8>[   17.476741] <LAVA_SIGNAL_TESTCASE TEST_CASE_ID=deferred-probe-empty RESULT=fail>

Test case failing:
Baseline Bootrr deferred-probe-empty test - https://github.com/kernelci/bootrr/blob/main/helpers/bootrr-generic-tests
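
To make the failure condition concrete, here is a minimal sketch of the kind
of check the test name implies. It is not the bootrr script itself. It
assumes the listing in the log comes from the debugfs file
/sys/kernel/debug/devices_deferred, with one "device<TAB>reason" entry per
line, and the helper names are hypothetical.

```python
from typing import List, Tuple


def parse_devices_deferred(text: str) -> List[Tuple[str, str]]:
    """Split a devices_deferred listing into (device, reason) pairs."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Assumed layout: device name, a tab, then an optional reason.
        name, _, reason = line.partition("\t")
        entries.append((name.strip(), reason.strip()))
    return entries


def deferred_probe_empty(text: str) -> bool:
    """Pass only when no device is left on the deferred probe list."""
    return not parse_devices_deferred(text)


# A few entries copied from the log above (each name is followed by a tab).
sample = "e6800000.ethernet\t\nsound\t\nee100000.mmc\t\n"
```

With the sample above, deferred_probe_empty(sample) is False, which
corresponds to the RESULT=fail reported by LAVA. An empty listing would pass.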

Regression reproduced:

LAVA job after reverting commit 5ee76c256e92:
https://lava.collabora.dev/scheduler/job/11292890


Bisection report from KernelCI can be found at the bottom of the email.

Thanks,
Shreeya Patel

#regzbot introduced: 5ee76c256e92
#regzbot title: KernelCI: Multiple devices deferring on r8a77960-ulcb

---------------------------------------------------------------------------------------------------------------------------------------------------

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* If you do send a fix, please include this trailer: *
* Reported-by: "kernelci.org bot" <bot@...> *
* *
* Hope this helps! *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

stable-rc/linux-5.10.y bisection: baseline.bootrr.deferred-probe-empty on r8a77960-ulcb

Summary:
  Start:      686c84f2f136 Linux 5.10.189-rc1
  Plain log:  https://storage.kernelci.org/stable-rc/linux-5.10.y/v5.10.188-183-g686c84f2f1364/arm64/defconfig/gcc-10/lab-collabora/baseline-r8a77960-ulcb.txt
  HTML log:   https://storage.kernelci.org/stable-rc/linux-5.10.y/v5.10.188-183-g686c84f2f1364/arm64/defconfig/gcc-10/lab-collabora/baseline-r8a77960-ulcb.html
  Result:     71cbce75031a driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction

Checks:
  revert:     PASS
  verify:     PASS

Parameters:
  Tree:       stable-rc
  URL:        https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
  Branch:     linux-5.10.y
  Target:     r8a77960-ulcb
  CPU arch:   arm64
  Lab:        lab-collabora
  Compiler:   gcc-10
  Config:     defconfig
  Test case:  baseline.bootrr.deferred-probe-empty

Breaking commit found:

-------------------------------------------------------------------------------
commit 71cbce75031aed26c72c2dc8a83111d181685f1b
Author: Saravana Kannan <saravanak@...>
Date: Fri Jun 3 13:31:37 2022 +0200

driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction

[ Upstream commit 5ee76c256e928455212ab759c51d198fedbe7523 ]

Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1]. This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.

Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.

However, if wait_for_device_probe() is called from the kernel_init()
context:

- Before deferred_probe_initcall() [2], it causes the boot process to
hang due to a deadlock.

- After deferred_probe_initcall() [3], it blocks kernel_init() from
continuing till deferred_probe_timeout expires and beats the point of
deferred_probe_timeout that's trying to wait for userspace to load
modules.

Neither of this is good. So revert the changes to
wait_for_device_probe().

[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/

Fixes: 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz <jstultz@...>
Cc: "David S. Miller" <davem@...>
Cc: Alexey Kuznetsov <kuznet@...>
Cc: Hideaki YOSHIFUJI <yoshfuji@...>
Cc: Jakub Kicinski <kuba@...>
Cc: Rob Herring <robh@...>
Cc: Geert Uytterhoeven <geert@...>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@...>
Cc: Robin Murphy <robin.murphy@...>
Cc: Andy Shevchenko <andy.shevchenko@...>
Cc: Sudeep Holla <sudeep.holla@...>
Cc: Andy Shevchenko <andriy.shevchenko@...>
Cc: Naresh Kamboju <naresh.kamboju@...>
Cc: Basil Eljuse <Basil.Eljuse@...>
Cc: Ferry Toth <fntoth@...>
Cc: Arnd Bergmann <arnd@...>
Cc: Anders Roxell <anders.roxell@...>
Cc: linux-pm@...
Reported-by: Nathan Chancellor <nathan@...>
Reported-by: Sebastian Andrzej Siewior <bigeasy@...>
Tested-by: Geert Uytterhoeven <geert+renesas@...>
Acked-by: John Stultz <jstultz@...>
Signed-off-by: Saravana Kannan <saravanak@...>
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@...>
Reviewed-by: Rafael J. Wysocki <rafael@...>
Signed-off-by: Linus Torvalds <torvalds@...>
Signed-off-by: Sasha Levin <sashal@...>

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 4f4e8aedbd2c..f9d9f1ad9215 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -250,7 +250,6 @@ DEFINE_SHOW_ATTRIBUTE(deferred_devs);
 
 int driver_deferred_probe_timeout;
 EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
-static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue);
 
 static int __init deferred_probe_timeout_setup(char *str)
 {
@@ -302,7 +301,6 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
 	list_for_each_entry(p, &deferred_probe_pending_list, deferred_probe)
 		dev_info(p->device, "deferred probe pending\n");
 	mutex_unlock(&deferred_probe_mutex);
-	wake_up_all(&probe_timeout_waitqueue);
 }
 static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
 
@@ -706,9 +704,6 @@ int driver_probe_done(void)
  */
 void wait_for_device_probe(void)
 {
-	/* wait for probe timeout */
-	wait_event(probe_timeout_waitqueue, !driver_deferred_probe_timeout);
-
 	/* wait for the deferred probe workqueue to finish */
 	flush_work(&deferred_probe_work);
-------------------------------------------------------------------------------


Git bisection log:

-------------------------------------------------------------------------------
git bisect start
# good: [2c85ebc57b3e1817b6ce1a6b703928e113a90442] Linux 5.10
git bisect good 2c85ebc57b3e1817b6ce1a6b703928e113a90442
# bad: [686c84f2f136412631eb684b064def993a96a8cc] Linux 5.10.189-rc1
git bisect bad 686c84f2f136412631eb684b064def993a96a8cc
# good: [88f1b613c37fbd3c4171f5a9decdcd12ae704637] Bluetooth: cmtp: fix possible panic when cmtp_init_sockets() fails
git bisect good 88f1b613c37fbd3c4171f5a9decdcd12ae704637
# bad: [6c5742372b2d5d36de129439e26eda05aab54652] Input: snvs_pwrkey - fix SNVS_HPVIDR1 register address
git bisect bad 6c5742372b2d5d36de129439e26eda05aab54652
# good: [07280d2c3f33d47741f42411eb8c976b70c6657a] random: make more consistent use of integer types
git bisect good 07280d2c3f33d47741f42411eb8c976b70c6657a
# bad: [2fc7f18ba2f98d15f174ce8e25a5afa46926eb55] tools headers: Remove broken definition of __LITTLE_ENDIAN
git bisect bad 2fc7f18ba2f98d15f174ce8e25a5afa46926eb55
# bad: [c2ae49a113a5344232f1ebb93bcf18bbd11e9c39] net: dsa: lantiq_gswip: Fix refcount leak in gswip_gphy_fw_list
git bisect bad c2ae49a113a5344232f1ebb93bcf18bbd11e9c39
# good: [c1b08aa568e829b743affe5d3231e6de28b7609e] ASoC: samsung: Use dev_err_probe() helper
git bisect good c1b08aa568e829b743affe5d3231e6de28b7609e
# good: [97a9ec86ccb4e336ecde46db42b59b2ff7e0d719] drm/nouveau/clk: Fix an incorrect NULL check on list iterator
git bisect good 97a9ec86ccb4e336ecde46db42b59b2ff7e0d719
# good: [572211d631d7665c6690b5a6cb80436f8c368dc1] pwm: lp3943: Fix duty calculation in case period was clamped
git bisect good 572211d631d7665c6690b5a6cb80436f8c368dc1
# good: [8f49e1694cbc29e76d5028267c1978cc2630e494] bpf: Fix probe read error in ___bpf_prog_run()
git bisect good 8f49e1694cbc29e76d5028267c1978cc2630e494
# bad: [3660db29b0305f9a1d95979c7af0f5db6ea99f5d] iommu/arm-smmu: fix possible null-ptr-deref in arm_smmu_device_probe()
git bisect bad 3660db29b0305f9a1d95979c7af0f5db6ea99f5d
# good: [04622d631826ba483ae3a0b8a71c745d8e21453d] gpio: pca953x: use the correct register address to do regcache sync
git bisect good 04622d631826ba483ae3a0b8a71c745d8e21453d
# bad: [32be2b805a1a13ccc68bd209ec3ae198dd3ba5d6] perf c2c: Fix sorting in percent_rmt_hitm_cmp()
git bisect bad 32be2b805a1a13ccc68bd209ec3ae198dd3ba5d6
# good: [c1f0187025905e9981000d44a92e159468b561a8] scsi: sd: Fix potential NULL pointer dereference
git bisect good c1f0187025905e9981000d44a92e159468b561a8
# bad: [71cbce75031aed26c72c2dc8a83111d181685f1b] driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction
git bisect bad 71cbce75031aed26c72c2dc8a83111d181685f1b
# good: [b8fac8e321044a9ac50f7185b4e9d91a7745e4b0] tipc: check attribute length for bearer name
git bisect good b8fac8e321044a9ac50f7185b4e9d91a7745e4b0
# first bad commit: [71cbce75031aed26c72c2dc8a83111d181685f1b] driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction
-------------------------------------------------------------------------------


> Fixes: 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
> Cc: John Stultz <jstultz@...gle.com>
> Cc: "David S. Miller" <davem@...emloft.net>
> Cc: Alexey Kuznetsov <kuznet@....inr.ac.ru>
> Cc: Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>
> Cc: Jakub Kicinski <kuba@...nel.org>
> Cc: Rob Herring <robh@...nel.org>
> Cc: Geert Uytterhoeven <geert@...ux-m68k.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@...esas.com>
> Cc: Robin Murphy <robin.murphy@....com>
> Cc: Andy Shevchenko <andy.shevchenko@...il.com>
> Cc: Sudeep Holla <sudeep.holla@....com>
> Cc: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
> Cc: Naresh Kamboju <naresh.kamboju@...aro.org>
> Cc: Basil Eljuse <Basil.Eljuse@....com>
> Cc: Ferry Toth <fntoth@...il.com>
> Cc: Arnd Bergmann <arnd@...db.de>
> Cc: Anders Roxell <anders.roxell@...aro.org>
> Cc: linux-pm@...r.kernel.org
> Reported-by: Nathan Chancellor <nathan@...nel.org>
> Reported-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> Tested-by: Geert Uytterhoeven <geert+renesas@...der.be>
> Acked-by: John Stultz <jstultz@...gle.com>
> Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
> Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Reviewed-by: Rafael J. Wysocki <rafael@...nel.org>
> Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Signed-off-by: Sasha Levin <sashal@...nel.org>
> ---
>   drivers/base/dd.c | 5 -----
>   1 file changed, 5 deletions(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 977e94cf669e..86fd2ea35656 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -257,7 +257,6 @@ DEFINE_SHOW_ATTRIBUTE(deferred_devs);
>   
>   int driver_deferred_probe_timeout;
>   EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
> -static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue);
>   
>   static int __init deferred_probe_timeout_setup(char *str)
>   {
> @@ -312,7 +311,6 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
>   	list_for_each_entry(p, &deferred_probe_pending_list, deferred_probe)
>   		dev_info(p->device, "deferred probe pending\n");
>   	mutex_unlock(&deferred_probe_mutex);
> -	wake_up_all(&probe_timeout_waitqueue);
>   }
>   static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
>   
> @@ -720,9 +718,6 @@ int driver_probe_done(void)
>    */
>   void wait_for_device_probe(void)
>   {
> -	/* wait for probe timeout */
> -	wait_event(probe_timeout_waitqueue, !driver_deferred_probe_timeout);
> -
>   	/* wait for the deferred probe workqueue to finish */
>   	flush_work(&deferred_probe_work);
>   
