Message-ID: <CABXGCsPvqBfL5hQDOARwfqasLRJ_eNPBbCngZ257HOe=xbWDkA@mail.gmail.com>
Date: Tue, 23 Jul 2024 00:36:18 +0500
From: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To: Jonathan.Cameron@...wei.com, rafael.j.wysocki@...el.com, 
	guohanjun@...wei.com, gshan@...hat.com, miguel.luis@...cle.com, 
	catalin.marinas@....com, 
	Linux List Kernel Mailing <linux-kernel@...r.kernel.org>, 
	Linux regressions mailing list <regressions@...ts.linux.dev>
Subject: 6.11/regression/bisected - The commit c1385c1f0ba3 caused a new
 possible recursive locking detected warning at computer boot.

Hi,
The first Fedora build of the 6.11 kernel
(kernel-debug-6.11.0-0.rc0.20240716gitd67978318827.2.fc41.x86_64)
triggers a new lockdep warning at boot: "possible recursive locking detected".
The trace looks like this:
ACPI: button: Power Button [PWRF]

============================================
WARNING: possible recursive locking detected
6.11.0-0.rc0.20240716gitd67978318827.2.fc41.x86_64+debug #1 Not tainted
--------------------------------------------
cpuhp/0/22 is trying to acquire lock:
ffffffffb7f9cb40 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0x12/0x20

but task is already holding lock:
ffffffffb7f9cb40 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0xcd/0x6f0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(cpu_hotplug_lock);
  lock(cpu_hotplug_lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by cpuhp/0/22:
 #0: ffffffffb7f9cb40 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0xcd/0x6f0
 #1: ffffffffb7f9f2e0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0xcd/0x6f0
 #2: ffffffffb7f1d650 (freq_invariance_lock){+.+.}-{3:3}, at: init_freq_invariance_cppc+0xf4/0x1e0

stack backtrace:
CPU: 0 PID: 22 Comm: cpuhp/0 Not tainted 6.11.0-0.rc0.20240716gitd67978318827.2.fc41.x86_64+debug #1
Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI, BIOS 2611 04/07/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x84/0xd0
 __lock_acquire+0x27e3/0x5c70
 ? __pfx___lock_acquire+0x10/0x10
 ? cppc_get_perf_caps+0x64f/0xf60
 lock_acquire+0x1ae/0x540
 ? static_key_enable+0x12/0x20
 ? __pfx_lock_acquire+0x10/0x10
 ? __pfx___might_resched+0x10/0x10
 cpus_read_lock+0x40/0xe0
 ? static_key_enable+0x12/0x20
 static_key_enable+0x12/0x20
 freq_invariance_enable+0x13/0x40
 init_freq_invariance_cppc+0x17e/0x1e0
 ? __pfx_init_freq_invariance_cppc+0x10/0x10
 ? acpi_cppc_processor_probe+0x1046/0x2300
 acpi_cppc_processor_probe+0x11ae/0x2300
 ? _raw_spin_unlock_irqrestore+0x4f/0x80
 ? __pfx_acpi_cppc_processor_probe+0x10/0x10
 ? __pfx_acpi_scan_drop_device+0x10/0x10
 ? acpi_fetch_acpi_dev+0x79/0xe0
 ? __pfx_acpi_fetch_acpi_dev+0x10/0x10
 ? __pfx_acpi_soft_cpu_online+0x10/0x10
 acpi_soft_cpu_online+0x114/0x330
 cpuhp_invoke_callback+0x2c7/0xa40
 ? __pfx_lock_release+0x10/0x10
 ? __pfx_lock_release+0x10/0x10
 ? cpuhp_thread_fun+0xcd/0x6f0
 cpuhp_thread_fun+0x33a/0x6f0
 ? smpboot_thread_fn+0x56/0x930
 smpboot_thread_fn+0x54b/0x930
 ? __pfx_smpboot_thread_fn+0x10/0x10
 ? __pfx_smpboot_thread_fn+0x10/0x10
 kthread+0x2d2/0x3a0
 ? _raw_spin_unlock_irq+0x28/0x60
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x31/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
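
For readers skimming the trace, the nesting it describes can be replayed with a small toy model. This is only a sketch of the report's contents, not of how lockdep works internally: the `HeldLocks` class and `replay_trace()` function are hypothetical names made up for illustration, and the lock names and call sites are copied from the "3 locks held" list and the backtrace above. It shows that the task already holds cpu_hotplug_lock (taken in cpuhp_thread_fun) when static_key_enable() reaches cpus_read_lock() and asks for the same lock class again.

```python
# Toy model only: real lockdep tracks lock classes, contexts and
# dependency chains in far more detail than this.

class HeldLocks:
    """Stack of lock classes held by one task (hypothetical helper)."""

    def __init__(self, task):
        self.task = task
        self.held = []  # (lock_class, acquired_at) in acquisition order

    def acquire(self, lock_class, where):
        """Record an acquisition; return a warning string on re-acquisition."""
        for held_class, held_where in self.held:
            if held_class == lock_class:
                return (
                    f"{self.task} is trying to acquire {lock_class} at {where} "
                    f"but already holds it from {held_where}"
                )
        self.held.append((lock_class, where))
        return None


def replay_trace():
    """Replay the acquisitions listed in the report, in order."""
    task = HeldLocks("cpuhp/0/22")
    steps = [
        # The three locks listed under "3 locks held by cpuhp/0/22".
        ("cpu_hotplug_lock", "cpuhp_thread_fun"),
        ("cpuhp_state-up", "cpuhp_thread_fun"),
        ("freq_invariance_lock", "init_freq_invariance_cppc"),
        # The acquisition the warning is about:
        # static_key_enable() calling cpus_read_lock().
        ("cpu_hotplug_lock", "static_key_enable"),
    ]
    for lock_class, where in steps:
        warning = task.acquire(lock_class, where)
        if warning:
            return warning
    return None


if __name__ == "__main__":
    print(replay_trace())
```

The model flags only the fourth step, which matches the "trying to acquire" and "already holding" lines at the top of the report.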

Bisection points to this commit:
commit c1385c1f0ba3b80bd12f26c440612175088c664c (HEAD)
Author: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Date:   Wed May 29 14:34:28 2024 +0100

    ACPI: processor: Simplify initial onlining to use same path for cold and hotplug

    Separate code paths, combined with a flag set in acpi_processor.c to
    indicate a struct acpi_processor was for a hotplugged CPU ensured that
    per CPU data was only set up the first time that a CPU was initialized.
    This appears to be unnecessary as the paths can be combined by letting
    the online logic also handle any CPUs online at the time of driver load.

    Motivation for this change, beyond simplification, is that ARM64
    virtual CPU HP uses the same code paths for hotplug and cold path in
    acpi_processor.c so had no easy way to set the flag for hotplug only.
    Removing this necessity will enable ARM64 vCPU HP to reuse the existing
    code paths.

    Acked-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
    Reviewed-by: Hanjun Guo <guohanjun@...wei.com>
    Tested-by: Miguel Luis <miguel.luis@...cle.com>
    Reviewed-by: Gavin Shan <gshan@...hat.com>
    Reviewed-by: Miguel Luis <miguel.luis@...cle.com>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
    Link: https://lore.kernel.org/r/20240529133446.28446-2-Jonathan.Cameron@huawei.com
    Signed-off-by: Catalin Marinas <catalin.marinas@....com>

 drivers/acpi/acpi_processor.c   |  7 +++----
 drivers/acpi/processor_driver.c | 43 ++++++++++++-------------------------------
 include/acpi/processor.h        |  2 +-
 3 files changed, 16 insertions(+), 36 deletions(-)

I can confirm that the warning is gone after reverting c1385c1f0ba3.

I have also attached the full kernel log and the build config.

My hardware specs: https://linux-hardware.org/?probe=c6de14f5b8

Jonathan, can you look into this, please?

-- 
Best Regards,
Mike Gavrilov.

Download attachment "6.11.0-0.rc0.20240716gitd67978318827.2.fc41.x86_64+debug.zip" of type "application/zip" (53007 bytes)

Download attachment ".config.zip" of type "application/zip" (66694 bytes)
