Message-ID: <20241125181231.XpOsxxHx@linutronix.de>
Date: Mon, 25 Nov 2024 19:12:31 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Guenter Roeck <linux@...ck-us.net>
Cc: sparclinux@...r.kernel.org, linux-kernel@...r.kernel.org,
Boqun Feng <boqun.feng@...il.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Andreas Larsson <andreas@...sler.com>
Subject: Re: [PATCH] sparc/pci: Make pci_poke_lock a raw_spinlock_t.
On 2024-11-25 09:59:09 [-0800], Guenter Roeck wrote:
> On 11/25/24 09:43, Sebastian Andrzej Siewior wrote:
> > On 2024-11-25 09:01:33 [-0800], Guenter Roeck wrote:
> > > Unfortunately it doesn't make a difference.
> >
> > stunning. It looks like the exact same error message.
> >
>
> I think it uses
>
> #define spin_lock_irqsave(lock, flags) \
> do { \
> raw_spin_lock_irqsave(spinlock_check(lock), flags); \
> } while (0)
>
> from include/linux/spinlock.h, meaning your patch doesn't really make a difference.
The difference comes from DEFINE_SPINLOCK vs DEFINE_RAW_SPINLOCK. The
lockdep wait type set by the initializer goes from LD_WAIT_CONFIG to
LD_WAIT_SPIN. And this is all that matters.
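
To illustrate the point, here is a minimal userspace sketch, not the
kernel code. The names model_lock, MODEL_DEFINE_SPINLOCK,
MODEL_DEFINE_RAW_SPINLOCK and model_acquire_wait_type are made up for
this example, and the numeric values assume the wait-type numbering seen
in the splat (2 for a raw_spinlock_t, 3 for a spinlock_t). Both locks go
through the same acquire helper; only the type recorded by the
initializer differs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed numbering, chosen to match the {2:2}/{3:3}/{4:4} in the splat. */
enum wait_type { WT_INV = 0, WT_FREE, WT_SPIN, WT_CONFIG, WT_SLEEP, WT_MAX };

/* Hypothetical stand-in for a lock plus its lockdep annotation. */
struct model_lock {
	const char *name;
	enum wait_type wait_type_inner;
};

/* The two initializers differ only in the wait type they record. */
#define MODEL_DEFINE_RAW_SPINLOCK(x) \
	struct model_lock x = { .name = #x, .wait_type_inner = WT_SPIN }
#define MODEL_DEFINE_SPINLOCK(x) \
	struct model_lock x = { .name = #x, .wait_type_inner = WT_CONFIG }

static MODEL_DEFINE_SPINLOCK(poke_lock_before);     /* without the patch */
static MODEL_DEFINE_RAW_SPINLOCK(poke_lock_after);  /* with the patch */

/* Shared acquire path: returns the wait type the checker would see. */
static enum wait_type model_acquire_wait_type(const struct model_lock *lock)
{
	assert(lock != NULL);
	return lock->wait_type_inner;
}
```

So the acquire function showing up in the backtrace can be identical in
both cases while the checker still sees a different lock type.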
> > > [ 1.050499] =============================
> > > [ 1.050801] [ BUG: Invalid wait context ]
> > > [ 1.051200] 6.12.0+ #1 Not tainted
> > > [ 1.051571] -----------------------------
> > > [ 1.051875] swapper/0/1 is trying to lock:
> > > [ 1.052201] 0000000001b694c8 (pci_poke_lock){....}-{3:3}, at: pci_config_read16+0x8/0x80
> > > [ 1.052994] other info that might help us debug this:
> > > [ 1.053331] context-{5:5}
> > > [ 1.053641] 2 locks held by swapper/0/1:
> > > [ 1.053959] #0: fffff800042b50f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x80/0x160
> > > [ 1.054388] #1: 0000000001d29078 (pci_lock){....}-{2:2}, at: pci_bus_read_config_word+0x18/0x80
> > > [ 1.054793] stack backtrace:
> > > [ 1.055171] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0+ #1
> > > [ 1.055632] Call Trace:
> > > [ 1.055985] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
> > > [ 1.056329] [<00000000004e63e8>] lock_acquire+0xe8/0x340
> > > [ 1.056645] [<00000000010f0dfc>] _raw_spin_lock_irqsave+0x3c/0x80
> > > [ 1.056966] [<0000000000443828>] pci_config_read16+0x8/0x80
> > > [ 1.057278] [<000000000044442c>] sun4u_read_pci_cfg+0x12c/0x1a0
> > > [ 1.057593] [<0000000000b7657c>] pci_bus_read_config_word+0x3c/0x80
> > > [ 1.057913] [<0000000000b7fa78>] pci_find_capability+0x18/0xa0
> > > [ 1.058228] [<0000000000b794b0>] set_pcie_port_type+0x10/0x160
> > > [ 1.058543] [<0000000000442a98>] pci_of_scan_bus+0x158/0xb00
> > > [ 1.058854] [<00000000010c74a0>] pci_scan_one_pbm+0xd0/0xf8
> > > [ 1.059167] [<0000000000446174>] sabre_probe+0x1f4/0x5c0
> > > [ 1.059476] [<0000000000c13a48>] platform_probe+0x28/0x80
> > > [ 1.059785] [<0000000000c11158>] really_probe+0xb8/0x340
> > > [ 1.060098] [<0000000000c11584>] driver_probe_device+0x24/0xe0
> > > [ 1.060413] [<0000000000c117ac>] __driver_attach+0x8c/0x160
> > > [ 1.060728] [<0000000000c0ef54>] bus_for_each_dev+0x54/0xc0
> > >
> > > The original call trace also included _raw_spin_lock_irqsave(), and
> > > I don't have CONFIG_PREEMPT_RT enabled in my sparc64 builds to start with.
> >
> > You don't have to. "CONFIG_PROVE_RAW_LOCK_NESTING" checks whether you
> > try to acquire raw_spinlock_t -> spinlock_t. Which it did before I
> > made the patch.
> > The pci_lock is from drivers/pci/access.c and is defined as
> > raw_spinlock_t. And I made pci_poke_lock of the same type. But debug
> > says 3:3 which suggests LD_WAIT_CONFIG. (No patch applied).
> >
> > > FWIW, I don't understand the value of
> > > pr_warn("context-{%d:%d}\n", curr_inner, curr_inner);
> > > Why print curr_inner twice ?
> >
> > The syntax was once (or is) inner:outer. If you look from the top, you
> > have 4 (mutex_t) followed by pci_lock (the raw_spinlock_t) 2. You are
> > at level 2 now and try to acquire spinlock_t (3).
> >
>
> How does that explain the
> context-{5:5}
This is the maximum value allowed by the context. Your context is a
plain process context, not handling an interrupt or anything of that
kind.
The culprit is
| swapper/0/1 is trying to lock:
| 0000000001b694c8 (pci_poke_lock){....}-{3:3}, at: pci_config_read16+0x8/0x80
where you have pci_poke_lock classified as a 3. The context allows a 5,
so based on the context alone, the 3 would fly. But since you already
hold pci_lock, which is a 2, we have the splat here.
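
The rule can be sketched in userspace like this. It is a simplified
model under assumptions, not the lockdep implementation: model_lock and
model_wait_context_ok are hypothetical names, the numbering is the one
seen in the splat, and things like trylocks and irq-context boundaries
are left out. The limit starts at what the context allows, every held
lock can only lower it, and the new lock must not exceed it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed numbering, chosen to match the values printed in the splat. */
enum wait_type { WT_INV = 0, WT_FREE, WT_SPIN, WT_CONFIG, WT_SLEEP, WT_MAX };

/* Hypothetical stand-in for a lock class with its two wait types. */
struct model_lock {
	const char *name;
	enum wait_type inner;
	enum wait_type outer;	/* WT_INV means "same as inner" */
};

/*
 * Return 1 if 'next' may be acquired while holding held[0..n-1] in a
 * context whose limit is 'context', 0 for an invalid wait context.
 */
static int model_wait_context_ok(enum wait_type context,
				 const struct model_lock *held, size_t n,
				 const struct model_lock *next)
{
	enum wait_type limit = context;
	enum wait_type next_outer;
	size_t i;

	assert(next != NULL);
	assert(held != NULL || n == 0);

	/* A lock without a wait type is not checked at all. */
	if (next->inner == WT_INV)
		return 1;

	next_outer = next->outer != WT_INV ? next->outer : next->inner;

	/* Each held lock can only lower the limit, never raise it. */
	for (i = 0; i < n; i++)
		if (held[i].inner != WT_INV && held[i].inner < limit)
			limit = held[i].inner;

	return next_outer <= limit;
}
```

With the two held locks from the splat, {4:4} and {2:2}, the limit drops
from 5 to 2. A lock classified as 3 is then rejected, while a lock
classified as 2 is accepted.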
> which is created from the following ?
> pr_warn("context-{%d:%d}\n", curr_inner, curr_inner);
>
> Again, why print curr_inner twice ?
It is the same syntax as in print_lock_name(), except that here we
don't have an outer type. The two values differ for RCU, because it has
a lower type than a spinlock_t: you can acquire a spinlock_t within an
RCU section and lockdep is fine with it. It starts yelling once you try
this with a mutex_t.
> Thanks,
> Guenter
Sebastian