Message-ID: <CAM0EoMmK7TJ4w_heeMuD+YmUdMyEz7VWKY+a+qMO2UN4GYZ5jQ@mail.gmail.com>
Date: Sun, 12 Oct 2025 11:52:54 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: syzbot <syzbot+51cd74c5dfeafd65e488@...kaller.appspotmail.com>,
Vinicius Costa Gomes <vinicius.gomes@...el.com>, davem@...emloft.net, dsahern@...nel.org,
hdanton@...a.com, horms@...nel.org, kuba@...nel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, netdev@...r.kernel.org,
pabeni@...hat.com, syzkaller-bugs@...glegroups.com, tglx@...utronix.de,
Vladimir Oltean <vladimir.oltean@....com>
Subject: Re: [syzbot] [net?] [mm?] INFO: rcu detected stall in
inet_rtm_newaddr (2)

On Sun, Oct 12, 2025 at 11:46 AM Jamal Hadi Salim <jhs@...atatu.com> wrote:
>
> On Sat, Oct 11, 2025 at 5:42 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Sat, Oct 11, 2025 at 12:41 AM syzbot
> > <syzbot+51cd74c5dfeafd65e488@...kaller.appspotmail.com> wrote:
> > >
> > > syzbot has found a reproducer for the following issue on:
> > >
> > > HEAD commit: 18a7e218cfcd Merge tag 'net-6.18-rc1' of git://git.kernel...
> > > git tree: net-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=12504dcd980000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=61ab7fa743df0ec1
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=51cd74c5dfeafd65e488
> > > compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14d2a542580000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=142149e2580000
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/7a01e6dce97e/disk-18a7e218.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/5e1b7e41427f/vmlinux-18a7e218.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/69b558601209/bzImage-18a7e218.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+51cd74c5dfeafd65e488@...kaller.appspotmail.com
> > >
> > > sched: DL replenish lagged too much
> > > rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > rcu: 0-...!: (2 GPs behind) idle=7754/1/0x4000000000000000 softirq=15464/15465 fqs=1
> > > rcu: (detected by 1, t=10502 jiffies, g=11321, q=371 ncpus=2)
> > > Sending NMI from CPU 1 to CPUs 0:
> > > NMI backtrace for cpu 0
> > > CPU: 0 UID: 0 PID: 5948 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
> > > RIP: 0010:rb_insert_color_cached include/linux/rbtree.h:113 [inline]
> > > RIP: 0010:rb_add_cached include/linux/rbtree.h:183 [inline]
> > > RIP: 0010:timerqueue_add+0x1a8/0x200 lib/timerqueue.c:40
> > > Code: e7 31 f6 e8 6a 0c de f6 42 80 3c 2b 00 74 08 4c 89 f7 e8 7b 0a de f6 4d 89 26 4d 8d 7e 08 4c 89 f8 48 c1 e8 03 42 80 3c 28 00 <74> 08 4c 89 ff e8 5e 0a de f6 4d 89 27 4d 85 e4 40 0f 95 c5 eb 07
> > > RSP: 0018:ffffc90000007cf0 EFLAGS: 00000046
> > > RAX: 1ffff110170c4f83 RBX: 1ffff110170c4f82 RCX: 0000000000000000
> > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88805de72358
> > > RBP: 0000000000000000 R08: ffff88805de72357 R09: 0000000000000000
> > > R10: ffff88805de72340 R11: ffffed100bbce46b R12: ffff88805de72340
> > > R13: dffffc0000000000 R14: ffff8880b8627c10 R15: ffff8880b8627c18
> > > FS: 000055557c657500(0000) GS:ffff888125d0f000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000200000000600 CR3: 000000002ee76000 CR4: 00000000003526f0
> > > Call Trace:
> > > <IRQ>
> > > __run_hrtimer kernel/time/hrtimer.c:1794 [inline]
> > > __hrtimer_run_queues+0x656/0xc60 kernel/time/hrtimer.c:1841
> > > hrtimer_interrupt+0x45b/0xaa0 kernel/time/hrtimer.c:1903
> > > local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1041 [inline]
> > > __sysvec_apic_timer_interrupt+0x108/0x410 arch/x86/kernel/apic/apic.c:1058
> > > instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
> > > sysvec_apic_timer_interrupt+0xa1/0xc0 arch/x86/kernel/apic/apic.c:1052
> > > </IRQ>
> > > <TASK>
> > > asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
> > > RIP: 0010:pv_vcpu_is_preempted arch/x86/include/asm/paravirt.h:579 [inline]
> > > RIP: 0010:vcpu_is_preempted arch/x86/include/asm/qspinlock.h:63 [inline]
> > > RIP: 0010:owner_on_cpu include/linux/sched.h:2282 [inline]
> > > RIP: 0010:mutex_spin_on_owner+0x189/0x360 kernel/locking/mutex.c:361
> > > Code: b6 04 30 84 c0 0f 85 59 01 00 00 48 8b 44 24 08 8b 18 48 8b 44 24 48 42 80 3c 30 00 74 0c 48 c7 c7 90 8c fa 8d e8 a7 cd 88 00 <48> 83 3d ff 27 5e 0c 00 0f 84 b9 01 00 00 48 89 df e8 41 e0 d5 ff
> > > RSP: 0018:ffffc900034c7428 EFLAGS: 00000246
> > > RAX: 1ffffffff1bf5192 RBX: 0000000000000001 RCX: ffffffff819c6588
> > > RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffffff8f4df8a0
> > > RBP: 1ffffffff1e9bf14 R08: ffffffff8f4df8a7 R09: 1ffffffff1e9bf14
> > > R10: dffffc0000000000 R11: fffffbfff1e9bf15 R12: ffffffff8f4df8a0
> > > R13: ffffffff8f4df8f0 R14: dffffc0000000000 R15: ffff8880267a9e40
> > > mutex_optimistic_spin kernel/locking/mutex.c:464 [inline]
> > > __mutex_lock_common kernel/locking/mutex.c:602 [inline]
> > > __mutex_lock+0x311/0x1350 kernel/locking/mutex.c:760
> > > rtnl_net_lock include/linux/rtnetlink.h:130 [inline]
> > > inet_rtm_newaddr+0x3b0/0x18b0 net/ipv4/devinet.c:978
> > > rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6954
> > > netlink_rcv_skb+0x205/0x470 net/netlink/af_netlink.c:2552
> > > netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
> > > netlink_unicast+0x82f/0x9e0 net/netlink/af_netlink.c:1346
> > > netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1896
> > > sock_sendmsg_nosec net/socket.c:727 [inline]
> > > __sock_sendmsg+0x21c/0x270 net/socket.c:742
> > > __sys_sendto+0x3bd/0x520 net/socket.c:2244
> > > __do_sys_sendto net/socket.c:2251 [inline]
> > > __se_sys_sendto net/socket.c:2247 [inline]
> > > __x64_sys_sendto+0xde/0x100 net/socket.c:2247
> > > do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> > > do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
> > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > RIP: 0033:0x7faade790d5c
> > > Code: 2a 5f 02 00 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 70 5f 02 00 48 8b
> > > RSP: 002b:00007ffdd2e3b670 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
> > > RAX: ffffffffffffffda RBX: 00007faadf514620 RCX: 00007faade790d5c
> > > RDX: 0000000000000028 RSI: 00007faadf514670 RDI: 0000000000000003
> > > RBP: 0000000000000000 R08: 00007ffdd2e3b6c4 R09: 000000000000000c
> > > R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003
> > > R13: 0000000000000000 R14: 00007faadf514670 R15: 0000000000000000
> > > </TASK>
> > > rcu: rcu_preempt kthread timer wakeup didn't happen for 10499 jiffies! g11321 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
> > > rcu: Possible timer handling issue on cpu=0 timer-softirq=4286
> > > rcu: rcu_preempt kthread starved for 10500 jiffies! g11321 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
> > > rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
> > > rcu: RCU grace-period kthread stack dump:
> > > task:rcu_preempt state:I stack:27224 pid:16 tgid:16 ppid:2 task_flags:0x208040 flags:0x00080000
> > > Call Trace:
> > > <TASK>
> > > context_switch kernel/sched/core.c:5325 [inline]
> > > __schedule+0x1798/0x4cc0 kernel/sched/core.c:6929
> > > __schedule_loop kernel/sched/core.c:7011 [inline]
> > > schedule+0x165/0x360 kernel/sched/core.c:7026
> > > schedule_timeout+0x12b/0x270 kernel/time/sleep_timeout.c:99
> > > rcu_gp_fqs_loop+0x301/0x1540 kernel/rcu/tree.c:2083
> > > rcu_gp_kthread+0x99/0x390 kernel/rcu/tree.c:2285
> > > kthread+0x711/0x8a0 kernel/kthread.c:463
> > > ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
> > > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> > > </TASK>
> > >
> > >
> > > ---
> > > If you want syzbot to run the reproducer, reply with:
> > > #syz test: git://repo/address.git branch-or-commit-hash
> > > If you attach or paste a git patch, syzbot will apply it before testing.
> >
> > Yet another taprio report.
> >
> > If taprio cannot be fixed, perhaps we should remove it from the
> > kernel, or clearly mark it as broken.
> > (Then ask syzbot to no longer include it)
>
> Agreed on the challenge with taprio.
> We need the stakeholders' input: Vinicius, are you still working in
> this space? Vladimir, you also seem to have an interest (or maybe NXP
> does) in this?
+ Vladimir..
> At a minimum, we should mark it as broken unless the stakeholders want
> to actively fix these issues.
> Would syzbot still look at it if it was marked broken?
>
> cheers,
> jamal