Message-ID: <20241125181231.XpOsxxHx@linutronix.de>
Date: Mon, 25 Nov 2024 19:12:31 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Guenter Roeck <linux@...ck-us.net>
Cc: sparclinux@...r.kernel.org, linux-kernel@...r.kernel.org,
	Boqun Feng <boqun.feng@...il.com>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>,
	"David S. Miller" <davem@...emloft.net>,
	Andreas Larsson <andreas@...sler.com>
Subject: Re: [PATCH] sparc/pci: Make pci_poke_lock a raw_spinlock_t.

On 2024-11-25 09:59:09 [-0800], Guenter Roeck wrote:
> On 11/25/24 09:43, Sebastian Andrzej Siewior wrote:
> > On 2024-11-25 09:01:33 [-0800], Guenter Roeck wrote:
> > > Unfortunately it doesn't make a difference.
> > 
> > stunning. It looks like the exact same error message.
> > 
> 
> I think it uses
> 
> #define spin_lock_irqsave(lock, flags)                          \
> do {                                                            \
>         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
> } while (0)
> 
> from include/linux/spinlock.h, meaning your patch doesn't really make a difference.

The difference comes from DEFINE_SPINLOCK vs DEFINE_RAW_SPINLOCK. The
initializer sets .wait_type_inner in the lock's dep_map, which goes from
LD_WAIT_CONFIG to LD_WAIT_SPIN. And this is all that matters.
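
A minimal user-space sketch of that point, not kernel code. It assumes
the numbering seen in the splat (raw_spinlock_t = 2, spinlock_t = 3).
The MODEL_* macros, struct model_lock and its field are hypothetical
stand-ins for the kernel definitions: both lock kinds go through the
same acquire path and only the wait type recorded at definition time
differs.

```c
#include <assert.h>
#include <stdio.h>

/* Numbering as it appears in the report (PROVE_RAW_LOCK_NESTING on). */
enum wait_type {
    WAIT_INV = 0,   /* not checked */
    WAIT_FREE,      /* 1 */
    WAIT_SPIN,      /* 2: raw_spinlock_t */
    WAIT_CONFIG,    /* 3: spinlock_t */
    WAIT_SLEEP,     /* 4: mutex */
    WAIT_MAX        /* 5 */
};

/* Hypothetical stand-in for the lockdep data embedded in a lock. */
struct model_lock {
    const char *name;
    enum wait_type wait_type_inner;
};

/* Hypothetical stand-ins for DEFINE_SPINLOCK / DEFINE_RAW_SPINLOCK. */
#define MODEL_DEFINE_SPINLOCK(x) \
    struct model_lock x = { .name = #x, .wait_type_inner = WAIT_CONFIG }
#define MODEL_DEFINE_RAW_SPINLOCK(x) \
    struct model_lock x = { .name = #x, .wait_type_inner = WAIT_SPIN }

static MODEL_DEFINE_SPINLOCK(poke_lock_before);
static MODEL_DEFINE_RAW_SPINLOCK(poke_lock_after);

/* Same acquire path for both; only the recorded type differs. */
static enum wait_type model_lock_irqsave(const struct model_lock *lock)
{
    assert(lock->wait_type_inner > WAIT_INV &&
           lock->wait_type_inner < WAIT_MAX);
    printf("%s-{%d:%d}\n", lock->name,
           (int)lock->wait_type_inner, (int)lock->wait_type_inner);
    return lock->wait_type_inner;
}
```

Calling model_lock_irqsave() on poke_lock_before reports a 3, on
poke_lock_after a 2, although the call itself is identical.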

> > > [    1.050499] =============================
> > > [    1.050801] [ BUG: Invalid wait context ]
> > > [    1.051200] 6.12.0+ #1 Not tainted
> > > [    1.051571] -----------------------------
> > > [    1.051875] swapper/0/1 is trying to lock:
> > > [    1.052201] 0000000001b694c8 (pci_poke_lock){....}-{3:3}, at: pci_config_read16+0x8/0x80
> > > [    1.052994] other info that might help us debug this:
> > > [    1.053331] context-{5:5}
> > > [    1.053641] 2 locks held by swapper/0/1:
> > > [    1.053959]  #0: fffff800042b50f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x80/0x160
> > > [    1.054388]  #1: 0000000001d29078 (pci_lock){....}-{2:2}, at: pci_bus_read_config_word+0x18/0x80
> > > [    1.054793] stack backtrace:
> > > [    1.055171] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0+ #1
> > > [    1.055632] Call Trace:
> > > [    1.055985] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
> > > [    1.056329] [<00000000004e63e8>] lock_acquire+0xe8/0x340
> > > [    1.056645] [<00000000010f0dfc>] _raw_spin_lock_irqsave+0x3c/0x80
> > > [    1.056966] [<0000000000443828>] pci_config_read16+0x8/0x80
> > > [    1.057278] [<000000000044442c>] sun4u_read_pci_cfg+0x12c/0x1a0
> > > [    1.057593] [<0000000000b7657c>] pci_bus_read_config_word+0x3c/0x80
> > > [    1.057913] [<0000000000b7fa78>] pci_find_capability+0x18/0xa0
> > > [    1.058228] [<0000000000b794b0>] set_pcie_port_type+0x10/0x160
> > > [    1.058543] [<0000000000442a98>] pci_of_scan_bus+0x158/0xb00
> > > [    1.058854] [<00000000010c74a0>] pci_scan_one_pbm+0xd0/0xf8
> > > [    1.059167] [<0000000000446174>] sabre_probe+0x1f4/0x5c0
> > > [    1.059476] [<0000000000c13a48>] platform_probe+0x28/0x80
> > > [    1.059785] [<0000000000c11158>] really_probe+0xb8/0x340
> > > [    1.060098] [<0000000000c11584>] driver_probe_device+0x24/0xe0
> > > [    1.060413] [<0000000000c117ac>] __driver_attach+0x8c/0x160
> > > [    1.060728] [<0000000000c0ef54>] bus_for_each_dev+0x54/0xc0
> > > 
> > > The original call trace also included _raw_spin_lock_irqsave(), and
> > > I don't have CONFIG_PREEMPT_RT enabled in my sparc64 builds to start with.
> > 
> > You don't have to. "CONFIG_PROVE_RAW_LOCK_NESTING" checks whether you
> > try to acquire raw_spinlock_t -> spinlock_t. Which it did before I made
> > the patch.
> > The pci_lock is from drivers/pci/access.c and is defined as
> > raw_spinlock_t. And I made pci_poke_lock of the same type. But debug
> > says 3:3 which suggests LD_WAIT_CONFIG. (No patch applied).
> > 
> > > FWIW, I don't understand the value of
> > > 	pr_warn("context-{%d:%d}\n", curr_inner, curr_inner);
> > > Why print curr_inner twice ?
> > 
> > The syntax was once (or is) inner:outer. If you look from the top, you
> > have 4 (mutex_t) followed by pci_lock (the raw_spinlock_t) 2. You are at
> > level 2 now and try to acquire spinlock_t (3).
> > 
> 
> How does that explain the
> 	context-{5:5}

This is the max value based on context. Your context is a simple
process, not handling an interrupt or anything of this kind.
The culprit is
|  swapper/0/1 is trying to lock:
|  0000000001b694c8 (pci_poke_lock){....}-{3:3}, at: pci_config_read16+0x8/0x80

where you have pci_poke_lock classified as a 3. The context allows a 5
so based on the context, the 3 would fly. But since pci_lock is a 2 and
is already held, a 3 is not allowed underneath it, and we have the splat
here.
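
The rule can be sketched in plain C. This is a simplified user-space
model under the numbering from the report, not the lockdep
implementation; wait_context_ok() and held_in_report are hypothetical
names. The limit starts at the context maximum (5), each held lock can
only lower it, and the new lock must not exceed the result.

```c
#include <assert.h>
#include <stddef.h>

enum wait_type {
    WAIT_INV = 0,   /* not checked */
    WAIT_FREE,      /* 1 */
    WAIT_SPIN,      /* 2: raw_spinlock_t */
    WAIT_CONFIG,    /* 3: spinlock_t */
    WAIT_SLEEP,     /* 4: mutex */
    WAIT_MAX        /* 5: plain task context */
};

/*
 * Returns 1 if a lock of type `next` may be acquired, 0 for a splat.
 * `context_max` is the limit given by the context, `held` lists the
 * wait types of the locks already held, outermost first.
 */
static int wait_context_ok(enum wait_type context_max,
                           const enum wait_type *held, size_t nheld,
                           enum wait_type next)
{
    enum wait_type curr_inner = context_max;

    assert(next != WAIT_INV);
    for (size_t i = 0; i < nheld; i++) {
        /* The most restrictive held lock lowers the limit. */
        if (held[i] != WAIT_INV && held[i] < curr_inner)
            curr_inner = held[i];
    }
    return next <= curr_inner;
}

/* Locks held in the report: &dev->mutex (4), then pci_lock (2). */
static const enum wait_type held_in_report[] = { WAIT_SLEEP, WAIT_SPIN };
```

With held_in_report and a context of 5, asking for a 3 returns 0 (the
splat), while asking for a 2 returns 1. With no locks held, the 3 is
fine.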

> which is created from the following ?
> 	pr_warn("context-{%d:%d}\n", curr_inner, curr_inner);
> 
> Again, why print curr_inner twice ?

It is the same syntax as in print_lock_name(). Except here, we don't
have an outer type. Where the two differ is RCU: it has a lower type
than a spinlock_t, yet you can acquire a spinlock_t within an RCU
section and lockdep is fine with it. It starts yelling once you try this
with a mutex_t.
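
A sketch of why two numbers are useful, again as a user-space model
with hypothetical names. The RCU values (outer = wait-free, inner = the
spinlock_t level) are an assumption about how the RCU read-side section
is annotated. The outer type is what gets checked when the lock itself
is acquired; the inner type is the limit it imposes on locks taken
while it is held.

```c
#include <assert.h>
#include <stddef.h>

enum wait_type {
    WAIT_INV = 0,   /* not set */
    WAIT_FREE,      /* 1 */
    WAIT_SPIN,      /* 2: raw_spinlock_t */
    WAIT_CONFIG,    /* 3: spinlock_t */
    WAIT_SLEEP,     /* 4: mutex */
    WAIT_MAX        /* 5: plain task context */
};

struct lock_class_model {
    enum wait_type outer;   /* checked on acquire; WAIT_INV = use inner */
    enum wait_type inner;   /* limit for locks taken underneath */
};

static int may_acquire(enum wait_type context_max,
                       const struct lock_class_model *held, size_t nheld,
                       const struct lock_class_model *next)
{
    enum wait_type limit = context_max;
    enum wait_type need =
        next->outer != WAIT_INV ? next->outer : next->inner;

    for (size_t i = 0; i < nheld; i++) {
        if (held[i].inner != WAIT_INV && held[i].inner < limit)
            limit = held[i].inner;
    }
    return need <= limit;
}

/* Assumed annotation of an RCU read-side section. */
static const struct lock_class_model rcu_model = { WAIT_FREE, WAIT_CONFIG };
static const struct lock_class_model raw_spinlock_model = { WAIT_INV, WAIT_SPIN };
static const struct lock_class_model spinlock_model = { WAIT_INV, WAIT_CONFIG };
static const struct lock_class_model mutex_model = { WAIT_INV, WAIT_SLEEP };
```

In this model a spinlock_t inside the RCU section passes (3 <= 3), a
mutex does not (4 > 3), and entering the RCU section while holding a
raw_spinlock_t passes because only the outer type (1) is checked.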

> Thanks,
> Guenter

Sebastian
