Message-ID: <5d269249-afd1-44f5-8faf-9ac11d9a3beb@redhat.com>
Date: Mon, 25 Nov 2024 14:33:04 -0500
From: Waiman Long <llong@...hat.com>
To: Guenter Roeck <linux@...ck-us.net>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: sparclinux@...r.kernel.org, linux-kernel@...r.kernel.org,
Boqun Feng <boqun.feng@...il.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>,
Will Deacon <will@...nel.org>, "David S. Miller" <davem@...emloft.net>,
Andreas Larsson <andreas@...sler.com>
Subject: Re: [PATCH] sparc/pci: Make pci_poke_lock a raw_spinlock_t.
On 11/25/24 2:23 PM, Guenter Roeck wrote:
> On 11/25/24 10:12, Sebastian Andrzej Siewior wrote:
>> On 2024-11-25 09:59:09 [-0800], Guenter Roeck wrote:
>>> On 11/25/24 09:43, Sebastian Andrzej Siewior wrote:
>>>> On 2024-11-25 09:01:33 [-0800], Guenter Roeck wrote:
>>>>> Unfortunately it doesn't make a difference.
>>>>
>>>> stunning. It looks like the exact same error message.
>>>>
>>>
>>> I think it uses
>>>
>>> #define spin_lock_irqsave(lock, flags)                          \
>>> do {                                                            \
>>>         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
>>> } while (0)
>>>
>>> from include/linux/spinlock.h, meaning your patch doesn't really
>>> make a difference.
>>
>> The difference comes from DEFINE_SPINLOCK vs DEFINE_RAW_SPINLOCK. There
>> is the .lock_type init which goes from LD_WAIT_CONFIG to LD_WAIT_SPIN.
>> And that is all that matters.
>>
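For reference, the initializer difference being pointed at looks roughly like the
sketch below. It is abridged from include/linux/spinlock_types_raw.h and the exact
layout varies between kernel versions; the relevant bit is the lockdep wait type
baked into the dep_map (only present when lockdep is enabled), which is what
CONFIG_PROVE_RAW_LOCK_NESTING checks when a lock is acquired inside a raw one:

    #define RAW_SPIN_DEP_MAP_INIT(lockname)             \
            .dep_map = {                                \
                    .name = #lockname,                  \
                    .wait_type_inner = LD_WAIT_SPIN,    \
            }

    #define SPIN_DEP_MAP_INIT(lockname)                 \
            .dep_map = {                                \
                    .name = #lockname,                  \
                    .wait_type_inner = LD_WAIT_CONFIG,  \
            }

    /* So the only functional change in a conversion like the patch here is: */
    /* before: */ static DEFINE_SPINLOCK(pci_poke_lock);     /* dep_map gets LD_WAIT_CONFIG */
    /* after:  */ static DEFINE_RAW_SPINLOCK(pci_poke_lock); /* dep_map gets LD_WAIT_SPIN   */
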
>
> Ah, now I get it. Thanks for the explanation. And it turns out my log
> was wrong.
> I must have taken it from the old image. Sorry for that.
>
> That specific backtrace isn't seen anymore. But there is another one.
>
> [ 1.779653] =============================
> [ 1.779860] [ BUG: Invalid wait context ]
> [ 1.780139] 6.12.0+ #1 Not tainted
> [ 1.780394] -----------------------------
> [ 1.780600] swapper/0/1 is trying to lock:
> [ 1.780824] 0000000001b68888 (cpu_map_lock){....}-{3:3}, at:
> map_to_cpu+0x10/0x80
> [ 1.781393] other info that might help us debug this:
> [ 1.781624] context-{5:5}
> [ 1.781838] 3 locks held by swapper/0/1:
> [ 1.782055] #0: fffff800042b90f8 (&dev->mutex){....}-{4:4}, at:
> __driver_attach+0x80/0x160
> [ 1.782345] #1: fffff800040f2c18
> (&desc->request_mutex){+.+.}-{4:4}, at: __setup_irq+0xa0/0x6e0
> [ 1.782632] #2: fffff800040f2ab0
> (&irq_desc_lock_class){....}-{2:2}, at: __setup_irq+0xc8/0x6e0
> [ 1.782912] stack backtrace:
> [ 1.783172] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> 6.12.0+ #1
> [ 1.783498] Call Trace:
> [ 1.783734] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
> [ 1.783971] [<00000000004e63e8>] lock_acquire+0xe8/0x340
> [ 1.784191] [<00000000010f0dbc>] _raw_spin_lock_irqsave+0x3c/0x80
> [ 1.784417] [<000000000043ed90>] map_to_cpu+0x10/0x80
> [ 1.784633] [<000000000042b2b8>] sun4u_irq_enable+0x18/0x80
> [ 1.784854] [<00000000004fb6b4>] irq_enable+0x34/0xc0
> [ 1.785069] [<00000000004fb7b8>] __irq_startup+0x78/0xe0
> [ 1.785287] [<00000000004fb8f0>] irq_startup+0xd0/0x1a0
> [ 1.785503] [<00000000004f85b4>] __setup_irq+0x5f4/0x6e0
> [ 1.785726] [<00000000004f8754>] request_threaded_irq+0xb4/0x1a0
> [ 1.785950] [<0000000000439930>] power_probe+0x70/0xe0
> [ 1.786165] [<0000000000c13a68>] platform_probe+0x28/0x80
> [ 1.786382] [<0000000000c11178>] really_probe+0xb8/0x340
> [ 1.786599] [<0000000000c115a4>] driver_probe_device+0x24/0xe0
> [ 1.786820] [<0000000000c117cc>] __driver_attach+0x8c/0x160
> [ 1.787039] [<0000000000c0ef74>] bus_for_each_dev+0x54/0xc0
>
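The nesting lockdep objects to is visible in the held-lock list: __setup_irq()
already holds the raw irq_desc lock ({2:2}, LD_WAIT_SPIN), and sun4u_irq_enable()
-> map_to_cpu() then takes cpu_map_lock, a plain spinlock_t ({3:3}, LD_WAIT_CONFIG)
that would become sleepable on PREEMPT_RT. The conversion mentioned in the next
paragraph would look roughly like the sketch below; the lock and function names are
taken from the backtrace, and the body is only a placeholder, not the real sparc
code:

    static DEFINE_RAW_SPINLOCK(cpu_map_lock);          /* was: DEFINE_SPINLOCK(cpu_map_lock) */

    static unsigned int map_to_cpu(unsigned int irq)   /* body is a placeholder */
    {
            unsigned long flags;
            unsigned int cpuid;

            raw_spin_lock_irqsave(&cpu_map_lock, flags);       /* was: spin_lock_irqsave() */
            cpuid = smp_processor_id();                 /* stand-in for the real CPU selection */
            raw_spin_unlock_irqrestore(&cpu_map_lock, flags);

            return cpuid;
    }
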
> After replacing cpu_map_lock with a raw spinlock, I get:
>
> [ 2.015140] =============================
> [ 2.015247] [ BUG: Invalid wait context ]
> [ 2.015419] 6.12.0+ #1 Not tainted
> [ 2.015564] -----------------------------
> [ 2.015668] swapper/0/1 is trying to lock:
> [ 2.015791] fffff80004870610 (&mm->context.lock){....}-{3:3}, at:
> __schedule+0x410/0x5b0
> [ 2.016306] other info that might help us debug this:
> [ 2.016451] context-{5:5}
> [ 2.016539] 3 locks held by swapper/0/1:
> [ 2.016652] #0: 0000000001d11f38 (key_types_sem){++++}-{4:4}, at:
> __key_create_or_update+0x5c/0x4c0
> [ 2.016934] #1: 0000000001d1b850
> (asymmetric_key_parsers_sem){++++}-{4:4}, at:
> asymmetric_key_preparse+0x18/0xa0
> [ 2.017197] #2: fffff8001f811a98 (&rq->__lock){-.-.}-{2:2}, at:
> __schedule+0xdc/0x5b0
> [ 2.017412] stack backtrace:
> [ 2.017551] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> 6.12.0+ #1
> [ 2.017800] Call Trace:
> [ 2.017910] [<00000000004e31d0>] __lock_acquire+0xa50/0x3160
> [ 2.018062] [<00000000004e63e8>] lock_acquire+0xe8/0x340
> [ 2.018192] [<00000000010f0dbc>] _raw_spin_lock_irqsave+0x3c/0x80
> [ 2.018341] [<00000000010e5050>] __schedule+0x410/0x5b0
> [ 2.018469] [<00000000010e5ae4>] schedule+0x44/0x1c0
> [ 2.018591] [<00000000010f0684>] schedule_timeout+0xa4/0x100
> [ 2.018730] [<00000000010e668c>] __wait_for_common+0xac/0x1a0
> [ 2.018869] [<00000000010e6878>] wait_for_completion_state+0x18/0x40
> [ 2.019022] [<000000000048ad18>] call_usermodehelper_exec+0x138/0x1c0
> [ 2.019177] [<000000000052eb40>] __request_module+0x160/0x2e0
> [ 2.019316] [<00000000009ba6dc>] crypto_alg_mod_lookup+0x17c/0x280
> [ 2.019466] [<00000000009ba990>] crypto_alloc_tfm_node+0x30/0x100
> [ 2.019614] [<00000000009dcc5c>]
> public_key_verify_signature+0xbc/0x260
> [ 2.019772] [<00000000009ded8c>] x509_check_for_self_signed+0xac/0x280
> [ 2.019928] [<00000000009dddec>] x509_cert_parse+0x14c/0x220
> [ 2.020065] [<00000000009dea08>] x509_key_preparse+0x8/0x1e0
>
> The problem here is
>
> typedef struct {
>         spinlock_t lock;                        <--
>         unsigned long sparc64_ctx_val;
>         unsigned long hugetlb_pte_count;
>         unsigned long thp_pte_count;
>         struct tsb_config tsb_block[MM_NUM_TSBS];
>         struct hv_tsb_descr tsb_descr[MM_NUM_TSBS];
>         void *vdso;
>         bool adi;
>         tag_storage_desc_t *tag_store;
>         spinlock_t tag_lock;
> } mm_context_t;
>
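Same pattern again: __schedule() holds the raw rq->__lock ({2:2}) and then,
presumably via the inlined sparc64 switch_mm(), takes mm->context.lock, which is
the spinlock_t marked above. The replacement described next would have to touch
both the field and every locker of it; a minimal, purely illustrative sketch
(function name and body are placeholders):

    typedef struct {
            raw_spinlock_t lock;            /* was: spinlock_t lock */
            unsigned long sparc64_ctx_val;
            /* ... remaining members as quoted above ... */
    } mm_context_t;

    /* every locker follows suit, e.g. on the context-switch path */
    static void context_switch_path(struct mm_struct *mm)      /* illustrative only */
    {
            unsigned long flags;

            raw_spin_lock_irqsave(&mm->context.lock, flags);   /* was: spin_lock_irqsave() */
            /* ... TSB / context bookkeeping ... */
            raw_spin_unlock_irqrestore(&mm->context.lock, flags);
    }
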
> Replacing that with a raw spinlock just triggers the next one.
>
> [ 2.035384] =============================
> [ 2.035490] [ BUG: Invalid wait context ]
> [ 2.035660] 6.12.0+ #3 Not tainted
> [ 2.035802] -----------------------------
> [ 2.035906] kworker/u4:3/48 is trying to lock:
> [ 2.036036] 0000000001b6a790 (ctx_alloc_lock){....}-{3:3}, at:
> get_new_mmu_context+0x14/0x280
> [ 2.036558] other info that might help us debug this:
> [ 2.036697] context-{5:5}
> [ 2.036784] 4 locks held by kworker/u4:3/48:
> [ 2.036906] #0: fffff80004838a70
> (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xc/0x8e0
> [ 2.037169] #1: fffff80004838b08
> (&sig->exec_update_lock){+.+.}-{4:4}, at: begin_new_exec+0x344/0xbe0
> [ 2.037411] #2: fffff800047fc940 (&p->alloc_lock){+.+.}-{3:3}, at:
> begin_new_exec+0x3a0/0xbe0
> [ 2.037639] #3: fffff80004848610 (&mm->context.lock){....}-{2:2},
> at: begin_new_exec+0x41c/0xbe0
>
> Fixing that finally gives me a clean run. Nevertheless, that makes me
> wonder:
> Should I just disable CONFIG_PROVE_RAW_LOCK_NESTING for sparc runtime
> tests?
If no one is ever going to enable PREEMPT_RT on SPARC, I suppose you
could disable CONFIG_PROVE_RAW_LOCK_NESTING to avoid the trouble.
Cheers,
Longman