lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250725003315.3163-1-hdanton@sina.com>
Date: Fri, 25 Jul 2025 08:33:14 +0800
From: Hillf Danton <hdanton@...a.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Waiman Long <llong@...hat.com>,
	Pavel Machek <pavel@....cz>,
	LKML <linux-kernel@...r.kernel.org>,
	Petr Mladek <pmladek@...e.com>,
	John Ogness <jogness@...utronix.de>,
	Tejun Heo <tj@...nel.org>
Subject: Re: locking problems in iwlwifi? was Re: 6.16-rcX: crashing way too often on thinkpad X220

On Thu, 24 Jul 2025 23:24:26 +0200 Thomas Gleixner wrote:
>On Thu, Jul 24 2025 at 13:55, Waiman Long wrote:
>> On 7/24/25 1:51 PM, Waiman Long wrote:
>>>> [   54.284095] ================================
>>>> [   54.284097] WARNING: inconsistent lock state
>>>> [   54.284100] 6.16.0-rc7+ #305 Tainted: G S
>>>> [   54.284104] --------------------------------
>>>> [   54.284105] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>>>> [   54.284108] wpa_supplicant/2940 [HC0[0]:SC0[0]:HE0:SE1] takes:
>>>> [   54.284114] ffffffff86263fe0 (console_owner){+.?.}-{0:0}, at: 
>>>> console_lock_spinning_enable+0x3d/0x60
>>>
>>> The lockdep warning just means that console_owner_lock is acquired 
>>> both in softirq context and in task context with interrupt enabled. 
>>> That can leads to deadlock. So the remedy is to always take 
>>> console_owner_lock with interrupt disabled, i.e. with 
>>> raw_spin_lock_irqsave/raw_spin_lock_irqrestore.
>>
>> I suppose that this lock can also be acquired in hardirq context. So a 
>> similar HARDIRQ warning can be printed if that happens first.
>
> None of this makes any sense whatsoever.
> 
> Both invocations of console_lock_spinning_enable() are within a hard
> interrupt disabled section.
> 
> 		printk_safe_enter_irqsave(flags);
> 		console_lock_spinning_enable();
> 
> The lockdep map which is printed here is not console_owner_lock,
> it's console_owner, which is a software managed lockdep_map:
> 
> static struct lockdep_map console_owner_dep_map = {
> 	.name = "console_owner"
> };
> 
> It's touched in console_lock_spinning_enable(),
> console_lock_spinning_disable_and_check() and console_trylock_spinning().
> 
> In all cases with interrupts disabled and all sites use
> printk_safe_enter_irqsave() which internally uses local_irq_save().
> 
> Now lets go back to the initial report, which is further down:
> 
> [   54.285777] raw_local_irq_restore() called with IRQs enabled
> [   54.285788] WARNING: CPU: 3 PID: 2940 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x25/0x30
> [   54.285801] Modules linked in:
> [   54.285807] CPU: 3 UID: 0 PID: 2940 Comm: wpa_supplicant Tainted: G S                  6.16.0-rc7+ #305 PREEMPT(voluntary) 
> [   54.285814] Tainted: [S]=CPU_OUT_OF_SPEC
> [   54.285817] Hardware name: LENOVO 4291W3B/4291W3B, BIOS 8DET73WW (1.43 ) 10/12/2016
> [   54.285820] RIP: 0010:warn_bogus_irq_restore+0x25/0x30
> [   54.285828] Code: 90 90 90 90 90 80 3d c2 5f e3 00 00 74 05 c3 cc cc cc cc 55 48 c7 c7 98 2f fa 85 c6 05 ac 5f e3 00 01 48 89 e5 e8 db 23 be fe <0f> 0b
> 5d c3 cc cc cc cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90
> [   54.285834] RSP: 0018:ffffc90001acb690 EFLAGS: 00010282
> [   54.285839] RAX: 0000000000000000 RBX: ffffffff863ed460 RCX: 0000000000000000
> [   54.285843] RDX: 0000000000000003 RSI: 0000000000000027 RDI: 00000000ffffffff
> [   54.285847] RBP: ffffc90001acb690 R08: 0000000000000000 R09: 0000000000000000
> [   54.285850] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> [   54.285853] R13: ffffc90001acb72b R14: 0000000000000200 R15: 0000000000000000
> [   54.285857] FS:  00007f79e08da1c0(0000) GS:ffff88828fcf5000(0000) knlGS:0000000000000000
> [   54.285862] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   54.285865] CR2: 000055b73b455000 CR3: 000000010b07e005 CR4: 00000000000606b0
> [   54.285869] Call Trace:
> [   54.285872]  <TASK>
> [   54.285874]  console_flush_all+0x47d/0x4d0
> [   54.285880]  ? console_flush_all+0x43/0x4d0
> [   54.285885]  ? console_flush_all+0x2d2/0x4d0
> [   54.285892]  console_unlock+0x55/0x100
> [   54.285924]  ieee80211_mgd_auth+0x2a3/0x5e0
> [   54.285934]  ? __this_cpu_preempt_check+0x13/0x20
> [   54.285940]  ieee80211_auth+0x13/0x20
> 
> I asked for decoding that console_flush_all+0x47d/0x4d0 line, but I'm
> 100% sure now that it is line 3128 in console_emit_next_record() which
> is inlined into console_flush_all()
> 
>      printk_safe_exit_irqrestore(flags);
> 
> Pavel, can you confirm that?
> 
> But let's look at the other stack trace in the lockdep splat which is
> way more interesting:
> 
> 	       stack backtrace:
> [   54.284433] CPU: 3 UID: 0 PID: 2940 Comm: wpa_supplicant Tainted: G S                  6.16.0-rc7+ #305 PREEMPT(voluntary) 
> [   54.284441] Tainted: [S]=CPU_OUT_OF_SPEC
> [   54.284443] Hardware name: LENOVO 4291W3B/4291W3B, BIOS 8DET73WW (1.43 ) 10/12/2016
> [   54.284446] Call Trace:
> [   54.284448]  <TASK>
> [   54.284450]  dump_stack_lvl+0x88/0xd0
> [   54.284458]  dump_stack+0x10/0x20
> [   54.284463]  print_usage_bug.part.0+0x237/0x2d0
> [   54.284470]  mark_lock.part.0+0xa9c/0xfb0
> [   54.284479]  mark_held_locks+0x4d/0x80
> [   54.284486]  lockdep_hardirqs_on_prepare+0xff/0x1c0
> [   54.284493]  trace_hardirqs_on+0x5a/0xe0
> [   54.284500]  _raw_spin_unlock_irq+0x23/0x60
> [   54.284505]  __flush_work+0x3b4/0x550
> [   54.284513]  ? __timer_delete+0x2f/0xd0
> [   54.284519]  ? timer_delete+0xb/0x20
> [   54.284524]  ? try_to_grab_pending+0x12a/0x320
> [   54.284534]  cancel_delayed_work_sync+0x65/0x70
> [   54.284539]  fbcon_cursor+0xbe/0x160
> [   54.284548]  hide_cursor+0x2c/0xc0
> [   54.284553]  vt_console_print+0x45e/0x470
> [   54.284560]  console_flush_all+0x301/0x4d0
> [   54.284565]  ? console_flush_all+0x2d2/0x4d0
> [   54.284569]  ? console_flush_all+0x43/0x4d0
> [   54.284572]  ? console_flush_all+0x2d2/0x4d0
> [   54.284579]  console_unlock+0x55/0x100
> [   54.284584]  vprintk_emit+0x15b/0x3a0
> [   54.284590]  vprintk_default+0x18/0x20
> [   54.284595]  vprintk+0x9/0x10
> [   54.284600]  _printk+0x52/0x70
> [   54.284608]  ieee80211_mgd_auth+0x2a3/0x5e0
> [   54.284619]  ? __this_cpu_preempt_check+0x13/0x20
> [   54.284625]  ieee80211_auth+0x13/0x20
> 
> That's the same call chain as the above which complains about the
> already enabled interrupt.
> 
> So the lockdep splat is telling us that the raw_spin_unlock_irq() in the
> workqueue code is making console_owner unsafe.
> 
Point A

> Now let's look where this comes from:
> 
>         console_flush_all+0x301/0x4d0
> 
> invokes
> 
>         vt_console_print+0x45e/0x470
> 
> via
> 
>         console_emit_next_record()
>         ...
>            printk_safe_enter_irqsave(flags);
>            ...
>            con->write(con, outbuf, pmsg.outbuf_len);
> 
> Again decoding console_flush_all+0x301/0x4d0 should confirm that.
> 
> Now that ends up in cancel_delayed_work_sync() via hide_cursor() ->
> fbcon_cursor().
> 
> cancel_delayed_work_sync() has a might_sleep() in it _and_
> __flush_work() uses raw_spin_[un]lock_irq() as it requires to be invoked
> in thread or in some cases in BH context.
> 
> vt_console_print() is the write() callback of the VT console driver. As
> that is a legacy console, it can't do that.
> 
> Only the new NBCON variants are allowed to sleep in their write()
> callbacks because they are running in their own printer thread
> context. For atomic printouts in case of emergency they can have a
> write_atomic() callback. If they don't then printing is not attempted,
> but that's not relevant for vt_console_print() obviously.
> 
> So lets look at that call chain further
> 
>    vt_console_print()
>      hide_cursor()
>       vc->vc_sw->con_cursor(vc, false); --> fbcon_cursor()
> 
>       fbcon_cursor()
>         if (vc->vc_cursor_type & CUR_SW)
>            fbcon_del_cursor_work(info)
>              cancel_delayed_work_sync(&ops->cursor_work);
> 
    cancel_delayed_work_sync(&ops->cursor_work);
      __cancel_work_sync()
	__cancel_work()
	  local_irq_save()
	  ...
	  local_irq_restore() // Point B
	__flush_work()
	  start_flush_work()
	    raw_spin_lock_irq(&pool->lock);
	    ...
	    raw_spin_unlock_irq(&pool->lock); // Point C

The difference between Point B and C wrt irq creates Point A.

> Here we are.
> 
> Now I checked whether this is a recent change, but I have to say it has
> been that way for a long time. It's a very interesting question why this
> never popped up before.
> 
> The only explanation I have so far is that the code path in
> cancel_delayed_work_sync() which uses [un]lock_irq() is only reached,
> when the works is executed at that point.
> 
> Unfortunatly the might_sleep() check in __cancel_work_sync() is after
> the point which determines that.
> 
> So it's a matter of debug options being enabled and the probability of
> hitting a work which is actually executed to trip over this problem.
> 
No, debug makes no sense. Nor does a running work.

> I have no fix for that, but at least there should be some clues now
> where to tackle this.
> 
Yes

> It's definitely not a printk bug. The culprit was hiding in plain sight
> for a long time in the fbcon_console code.
> 
Nope because of Point A.

Hillf

> Thanks,
> 
>         tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ