Message-ID: <50f069c5-d962-4743-a8b0-dc1bc4811599@gmail.com>
Date: Tue, 26 Aug 2025 10:20:22 +0800
From: Leon Hwang <hffilwlqm@...il.com>
To: syzbot <syzbot+fa5c2814795b5adca240@...kaller.appspotmail.com>,
andrii@...nel.org, ast@...nel.org, bpf@...r.kernel.org,
daniel@...earbox.net, eddyz87@...il.com, haoluo@...gle.com,
john.fastabend@...il.com, jolsa@...nel.org, kpsingh@...nel.org,
linux-kernel@...r.kernel.org, martin.lau@...ux.dev, netdev@...r.kernel.org,
sdf@...ichev.me, song@...nel.org, syzkaller-bugs@...glegroups.com,
yonghong.song@...ux.dev
Subject: Re: [syzbot] [bpf?] possible deadlock in __bpf_ringbuf_reserve (2)
On 26/8/25 01:39, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: dd9de524183a xsk: Fix immature cq descriptor production
> git tree: bpf
> console output: https://syzkaller.appspot.com/x/log.txt?x=102da862580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=c321f33e4545e2a1
> dashboard link: https://syzkaller.appspot.com/bug?extid=fa5c2814795b5adca240
> compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=142da862580000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1588aef0580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/5a3389c1558f/disk-dd9de524.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/c97133192a27/vmlinux-dd9de524.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/3ae5a1a88637/bzImage-dd9de524.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+fa5c2814795b5adca240@...kaller.appspotmail.com
>
> ============================================
> WARNING: possible recursive locking detected
> syzkaller #0 Not tainted
> --------------------------------------------
> syz-execprog/5866 is trying to acquire lock:
> ffffc900048c10d8 (&rb->spinlock){-.-.}-{2:2}, at: __bpf_ringbuf_reserve+0x1c7/0x5a0 kernel/bpf/ringbuf.c:423
>
> but task is already holding lock:
> ffffc900048e90d8 (&rb->spinlock){-.-.}-{2:2}, at: __bpf_ringbuf_reserve+0x1c7/0x5a0 kernel/bpf/ringbuf.c:423
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&rb->spinlock);
> lock(&rb->spinlock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
Confirmed.

I can reproduce this deadlock and will work on a fix.
Thanks,
Leon
>
> 6 locks held by syz-execprog/5866:
> #0: ffff88807e021588 (vm_lock){++++}-{0:0}, at: lock_vma_under_rcu+0x19f/0x3d0 mm/mmap_lock.c:147
> #1: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
> #1: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
> #1: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: ___pte_offset_map+0x29/0x250 mm/pgtable-generic.c:286
> #2: ffff8880787b60d8 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}, at: spin_lock include/linux/spinlock.h:351 [inline]
> #2: ffff8880787b60d8 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}, at: __pte_offset_map_lock+0x13e/0x210 mm/pgtable-generic.c:401
> #3: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
> #3: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
> #3: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2256 [inline]
> #3: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: bpf_trace_run2+0x186/0x4b0 kernel/trace/bpf_trace.c:2298
> #4: ffffc900048e90d8 (&rb->spinlock){-.-.}-{2:2}, at: __bpf_ringbuf_reserve+0x1c7/0x5a0 kernel/bpf/ringbuf.c:423
> #5: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
> #5: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
> #5: ffffffff8e139ea0 (rcu_read_lock){....}-{1:3}, at: trace_call_bpf+0xb7/0x850 kernel/trace/bpf_trace.c:-1
>
> stack backtrace:
> CPU: 1 UID: 0 PID: 5866 Comm: syz-execprog Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
> Call Trace:
> <TASK>
> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
> print_deadlock_bug+0x28b/0x2a0 kernel/locking/lockdep.c:3041
> check_deadlock kernel/locking/lockdep.c:3093 [inline]
> validate_chain+0x1a3f/0x2140 kernel/locking/lockdep.c:3895
> __lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5237
> lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868
> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> _raw_spin_lock_irqsave+0xa7/0xf0 kernel/locking/spinlock.c:162
> __bpf_ringbuf_reserve+0x1c7/0x5a0 kernel/bpf/ringbuf.c:423
> ____bpf_ringbuf_reserve kernel/bpf/ringbuf.c:474 [inline]
> bpf_ringbuf_reserve+0x5c/0x70 kernel/bpf/ringbuf.c:466
> bpf_prog_df2ea1bb7efca089+0x36/0x54
> bpf_dispatcher_nop_func include/linux/bpf.h:1332 [inline]
> __bpf_prog_run include/linux/filter.h:718 [inline]
> bpf_prog_run include/linux/filter.h:725 [inline]
> bpf_prog_run_array include/linux/bpf.h:2292 [inline]
> trace_call_bpf+0x326/0x850 kernel/trace/bpf_trace.c:146
> perf_trace_run_bpf_submit+0x78/0x170 kernel/events/core.c:10911
> do_perf_trace_contention_end include/trace/events/lock.h:122 [inline]
> perf_trace_contention_end+0x253/0x2f0 include/trace/events/lock.h:122
> __do_trace_contention_end include/trace/events/lock.h:122 [inline]
> trace_contention_end+0x111/0x140 include/trace/events/lock.h:122
> __pv_queued_spin_lock_slowpath+0x9f9/0xb60 kernel/locking/qspinlock.c:374
> pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:557 [inline]
> queued_spin_lock_slowpath+0x43/0x50 arch/x86/include/asm/qspinlock.h:51
> queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
> do_raw_spin_lock+0x21f/0x290 kernel/locking/spinlock_debug.c:116
> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
> _raw_spin_lock_irqsave+0xb3/0xf0 kernel/locking/spinlock.c:162
> __bpf_ringbuf_reserve+0x1c7/0x5a0 kernel/bpf/ringbuf.c:423
> ____bpf_ringbuf_reserve kernel/bpf/ringbuf.c:474 [inline]
> bpf_ringbuf_reserve+0x5c/0x70 kernel/bpf/ringbuf.c:466
> bpf_prog_6979e45824a16319+0x36/0x66
> bpf_dispatcher_nop_func include/linux/bpf.h:1332 [inline]
> __bpf_prog_run include/linux/filter.h:718 [inline]
> bpf_prog_run include/linux/filter.h:725 [inline]
> __bpf_trace_run kernel/trace/bpf_trace.c:2257 [inline]
> bpf_trace_run2+0x281/0x4b0 kernel/trace/bpf_trace.c:2298
> __bpf_trace_tlb_flush+0xf5/0x150 include/trace/events/tlb.h:38
> __traceiter_tlb_flush+0x76/0xd0 include/trace/events/tlb.h:38
> __do_trace_tlb_flush include/trace/events/tlb.h:38 [inline]
> trace_tlb_flush+0x115/0x140 include/trace/events/tlb.h:38
> native_flush_tlb_multi+0x78/0x140 arch/x86/mm/tlb.c:-1
> __flush_tlb_multi arch/x86/include/asm/paravirt.h:91 [inline]
> flush_tlb_multi arch/x86/mm/tlb.c:1361 [inline]
> flush_tlb_mm_range+0x6b1/0x12d0 arch/x86/mm/tlb.c:1451
> flush_tlb_page arch/x86/include/asm/tlbflush.h:324 [inline]
> ptep_clear_flush+0x120/0x170 mm/pgtable-generic.c:101
> wp_page_copy mm/memory.c:3618 [inline]
> do_wp_page+0x1bc2/0x5800 mm/memory.c:4013
> handle_pte_fault mm/memory.c:6068 [inline]
> __handle_mm_fault+0x1033/0x5440 mm/memory.c:6195
> handle_mm_fault+0x40a/0x8e0 mm/memory.c:6364
> do_user_addr_fault+0xa81/0x1390 arch/x86/mm/fault.c:1336
> handle_page_fault arch/x86/mm/fault.c:1476 [inline]
> exc_page_fault+0x76/0xf0 arch/x86/mm/fault.c:1532
> asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> RIP: 0033:0x410964
> Code: b9 01 00 00 00 90 e8 3b 36 06 00 84 00 48 8d 50 10 83 3d be 5f d9 02 00 74 10 e8 87 d9 06 00 49 89 13 48 8b 48 10 49 89 4b 08 <48> 89 50 10 48 89 c1 48 8b 54 24 20 48 8b 1a 66 89 59 18 83 3d 92
> RSP: 002b:000000c0000e7608 EFLAGS: 00010246
> RAX: 000000c0080e0000 RBX: 0000000000000070 RCX: 0000000007b2a8a0
> RDX: 000000c0080e0010 RSI: 000000000b1130e1 RDI: 0000000009e14d67
> RBP: 000000c0000e7638 R08: 00007f16e8ad96e0 R09: 7fffffffffffffff
> R10: 0000000000000001 R11: 00007f16e8a2e000 R12: 000000c0080e0000
> R13: 0000000000000049 R14: 000000c0026156c0 R15: 000000c0080c1c20
> </TASK>
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@...glegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup
>