Message-ID: <72991b83-173e-492e-a4aa-5049304c1bd0@roeck-us.net>
Date: Mon, 25 Nov 2024 11:23:18 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: sparclinux@...r.kernel.org, linux-kernel@...r.kernel.org,
Boqun Feng <boqun.feng@...il.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>,
Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Andreas Larsson <andreas@...sler.com>
Subject: Re: [PATCH] sparc/pci: Make pci_poke_lock a raw_spinlock_t.
On 11/25/24 10:12, Sebastian Andrzej Siewior wrote:
> On 2024-11-25 09:59:09 [-0800], Guenter Roeck wrote:
>> On 11/25/24 09:43, Sebastian Andrzej Siewior wrote:
>>> On 2024-11-25 09:01:33 [-0800], Guenter Roeck wrote:
>>>> Unfortunately it doesn't make a difference.
>>>
>>> stunning. It looks like the exact same error message.
>>>
>>
>> I think it uses
>>
>> #define spin_lock_irqsave(lock, flags) \
>> do { \
>> 	raw_spin_lock_irqsave(spinlock_check(lock), flags); \
>> } while (0)
>>
>> from include/linux/spinlock.h, meaning your patch doesn't really make a difference.
>
> The difference comes from DEFINE_SPINLOCK vs DEFINE_RAW_SPINLOCK. There
> is the .lock_type init which goes from LD_WAIT_CONFIG to LD_WAIT_SPIN.
> And that is all that matters.
>
Ah, now I get it. Thanks for the explanation. And it turns out my log was wrong.
I must have taken it from the old image. Sorry for that.
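
To make sure I understand the wait-type part: if I read the lockdep output right, the second {n:n} pair in the splats below is the wait type (2 = raw spinlock, 3 = spinlock_t, 4 = sleeping locks), and the checker complains whenever a lock with a larger wait type is acquired while a lock with a smaller one is already held. A throwaway userspace model of that idea (not the real lockdep code; the names and numbers are only my reading of the output):

/*
 * Rough model of what CONFIG_PROVE_RAW_LOCK_NESTING checks.
 * Not kernel code; the values just mirror the {n:n} annotations.
 */
#include <stdio.h>

enum wait_type {
	WAIT_SPIN   = 2,	/* raw_spinlock_t: never sleeps          */
	WAIT_CONFIG = 3,	/* spinlock_t: becomes sleeping on RT    */
	WAIT_SLEEP  = 4,	/* mutex/rwsem: always allowed to sleep  */
};

struct model_lock {
	const char *name;
	enum wait_type type;
};

/* innermost wait type currently held; nothing held means "may sleep" */
static enum wait_type curr_inner = WAIT_SLEEP;

static void model_acquire(const struct model_lock *lock)
{
	if (lock->type > curr_inner)
		printf("BUG: Invalid wait context: %s (%d) inside a %d region\n",
		       lock->name, lock->type, curr_inner);
	if (lock->type < curr_inner)
		curr_inner = lock->type;
}

int main(void)
{
	struct model_lock irq_desc = { "irq_desc_lock_class", WAIT_SPIN };	/* raw_spinlock_t */
	struct model_lock cpu_map  = { "cpu_map_lock", WAIT_CONFIG };		/* spinlock_t */

	model_acquire(&irq_desc);	/* fine: 2 <= 4 */
	model_acquire(&cpu_map);	/* complains: 3 > 2, like the splat below */
	return 0;
}

That matches the pattern in the backtraces: a spinlock_t is taken underneath a raw spinlock, which works today but would be a sleeping lock in atomic context on PREEMPT_RT.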
Anyway, that specific backtrace isn't seen anymore, but there is another one:
[ 1.779653] =============================
[ 1.779860] [ BUG: Invalid wait context ]
[ 1.780139] 6.12.0+ #1 Not tainted
[ 1.780394] -----------------------------
[ 1.780600] swapper/0/1 is trying to lock:
[ 1.780824] 0000000001b68888 (cpu_map_lock){....}-{3:3}, at: map_to_cpu+0x10/0x80
[ 1.781393] other info that might help us debug this:
[ 1.781624] context-{5:5}
[ 1.781838] 3 locks held by swapper/0/1:
[ 1.782055] #0: fffff800042b90f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x80/0x160
[ 1.782345] #1: fffff800040f2c18 (&desc->request_mutex){+.+.}-{4:4}, at: __setup_irq+0xa0/0x6e0
[ 1.782632] #2: fffff800040f2ab0 (&irq_desc_lock_class){....}-{2:2}, at: __setup_irq+0xc8/0x6e0
[ 1.782912] stack backtrace:
[ 1.783172] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0+ #1
[ 1.783498] Call Trace:
[ 1.783734] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
[ 1.783971] [<00000000004e63e8>] lock_acquire+0xe8/0x340
[ 1.784191] [<00000000010f0dbc>] _raw_spin_lock_irqsave+0x3c/0x80
[ 1.784417] [<000000000043ed90>] map_to_cpu+0x10/0x80
[ 1.784633] [<000000000042b2b8>] sun4u_irq_enable+0x18/0x80
[ 1.784854] [<00000000004fb6b4>] irq_enable+0x34/0xc0
[ 1.785069] [<00000000004fb7b8>] __irq_startup+0x78/0xe0
[ 1.785287] [<00000000004fb8f0>] irq_startup+0xd0/0x1a0
[ 1.785503] [<00000000004f85b4>] __setup_irq+0x5f4/0x6e0
[ 1.785726] [<00000000004f8754>] request_threaded_irq+0xb4/0x1a0
[ 1.785950] [<0000000000439930>] power_probe+0x70/0xe0
[ 1.786165] [<0000000000c13a68>] platform_probe+0x28/0x80
[ 1.786382] [<0000000000c11178>] really_probe+0xb8/0x340
[ 1.786599] [<0000000000c115a4>] driver_probe_device+0x24/0xe0
[ 1.786820] [<0000000000c117cc>] __driver_attach+0x8c/0x160
[ 1.787039] [<0000000000c0ef74>] bus_for_each_dev+0x54/0xc0
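
For the record, the cpu_map_lock change I tried is roughly the following (typed from memory, not the exact hunk; the spin_lock_irqsave()/spin_unlock_irqrestore() calls on it were switched to the raw_ variants as well):

-static DEFINE_SPINLOCK(cpu_map_lock);
+static DEFINE_RAW_SPINLOCK(cpu_map_lock);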
After replacing cpu_map_lock with a raw spinlock, I get:
[ 2.015140] =============================
[ 2.015247] [ BUG: Invalid wait context ]
[ 2.015419] 6.12.0+ #1 Not tainted
[ 2.015564] -----------------------------
[ 2.015668] swapper/0/1 is trying to lock:
[ 2.015791] fffff80004870610 (&mm->context.lock){....}-{3:3}, at: __schedule+0x410/0x5b0
[ 2.016306] other info that might help us debug this:
[ 2.016451] context-{5:5}
[ 2.016539] 3 locks held by swapper/0/1:
[ 2.016652] #0: 0000000001d11f38 (key_types_sem){++++}-{4:4}, at: __key_create_or_update+0x5c/0x4c0
[ 2.016934] #1: 0000000001d1b850 (asymmetric_key_parsers_sem){++++}-{4:4}, at: asymmetric_key_preparse+0x18/0xa0
[ 2.017197] #2: fffff8001f811a98 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0xdc/0x5b0
[ 2.017412] stack backtrace:
[ 2.017551] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0+ #1
[ 2.017800] Call Trace:
[ 2.017910] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
[ 2.018062] [<00000000004e63e8>] lock_acquire+0xe8/0x340
[ 2.018192] [<00000000010f0dbc>] _raw_spin_lock_irqsave+0x3c/0x80
[ 2.018341] [<00000000010e5050>] __schedule+0x410/0x5b0
[ 2.018469] [<00000000010e5ae4>] schedule+0x44/0x1c0
[ 2.018591] [<00000000010f0684>] schedule_timeout+0xa4/0x100
[ 2.018730] [<00000000010e668c>] __wait_for_common+0xac/0x1a0
[ 2.018869] [<00000000010e6878>] wait_for_completion_state+0x18/0x40
[ 2.019022] [<000000000048ad18>] call_usermodehelper_exec+0x138/0x1c0
[ 2.019177] [<000000000052eb40>] __request_module+0x160/0x2e0
[ 2.019316] [<00000000009ba6dc>] crypto_alg_mod_lookup+0x17c/0x280
[ 2.019466] [<00000000009ba990>] crypto_alloc_tfm_node+0x30/0x100
[ 2.019614] [<00000000009dcc5c>] public_key_verify_signature+0xbc/0x260
[ 2.019772] [<00000000009ded8c>] x509_check_for_self_signed+0xac/0x280
[ 2.019928] [<00000000009dddec>] x509_cert_parse+0x14c/0x220
[ 2.020065] [<00000000009dea08>] x509_key_preparse+0x8/0x1e0
The problem here is:

typedef struct {
	spinlock_t		lock;		<--
	unsigned long		sparc64_ctx_val;
	unsigned long		hugetlb_pte_count;
	unsigned long		thp_pte_count;
	struct tsb_config	tsb_block[MM_NUM_TSBS];
	struct hv_tsb_descr	tsb_descr[MM_NUM_TSBS];
	void			*vdso;
	bool			adi;
	tag_storage_desc_t	*tag_store;
	spinlock_t		tag_lock;
} mm_context_t;
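
What I tried there was the obvious hand edit (again approximate, with the spin_lock*() calls on context.lock converted to the raw_ variants):

-	spinlock_t		lock;
+	raw_spinlock_t		lock;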
Replacing that with a raw spinlock just triggers the next one.
[ 2.035384] =============================
[ 2.035490] [ BUG: Invalid wait context ]
[ 2.035660] 6.12.0+ #3 Not tainted
[ 2.035802] -----------------------------
[ 2.035906] kworker/u4:3/48 is trying to lock:
[ 2.036036] 0000000001b6a790 (ctx_alloc_lock){....}-{3:3}, at: get_new_mmu_context+0x14/0x280
[ 2.036558] other info that might help us debug this:
[ 2.036697] context-{5:5}
[ 2.036784] 4 locks held by kworker/u4:3/48:
[ 2.036906] #0: fffff80004838a70 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xc/0x8e0
[ 2.037169] #1: fffff80004838b08 (&sig->exec_update_lock){+.+.}-{4:4}, at: begin_new_exec+0x344/0xbe0
[ 2.037411] #2: fffff800047fc940 (&p->alloc_lock){+.+.}-{3:3}, at: begin_new_exec+0x3a0/0xbe0
[ 2.037639] #3: fffff80004848610 (&mm->context.lock){....}-{2:2}, at: begin_new_exec+0x41c/0xbe0
Fixing that finally gives me a clean run. Nevertheless, that makes me wonder:
should I just disable CONFIG_PROVE_RAW_LOCK_NESTING for sparc runtime tests?
Thanks,
Guenter