lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9e12bfe8-a3c2-47b2-8a3f-5b61b0bd5c08@paulmck-laptop>
Date: Fri, 19 Apr 2024 00:12:00 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Z qiang <qiang.zhang1211@...il.com>
Cc: syzbot <syzbot+dce04ed6d1438ad69656@...kaller.appspotmail.com>,
	linux-kernel@...r.kernel.org, peterz@...radead.org,
	syzkaller-bugs@...glegroups.com, tglx@...utronix.de
Subject: Re: [syzbot] [kernel?] WARNING: suspicious RCU usage in __do_softirq

On Fri, Apr 19, 2024 at 01:50:02PM +0800, Z qiang wrote:
> >
> > On Thu, Apr 18, 2024 at 05:49:38PM +0800, Z qiang wrote:
> > > >
> > > > On Wed, Apr 17, 2024 at 10:25:01AM +0800, Z qiang wrote:
> > > > > >
> > > > > > On Tue, Apr 16, 2024 at 04:44:54PM +0800, Z qiang wrote:
> > > > > > > On Tue, Apr 16, 2024 at 4:10 PM Z qiang <qiang.zhang1211@...il.com> wrote:
> > > > > > > >
> > > > > > > > Cc: Paul
> > > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > syzbot found the following issue on:
> > > > > > > > >
> > > > > > > > > HEAD commit:    c0b832517f62 Add linux-next specific files for 20240402
> > > > > > > > > git tree:       linux-next
> > > > > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f64776180000
> > > > > > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=afcaf46d374cec8c
> > > > > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=dce04ed6d1438ad69656
> > > > > > > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > > > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10f00471180000
> > > > > > > > >
> > > > > > > > > Downloadable assets:
> > > > > > > > > disk image: https://storage.googleapis.com/syzbot-assets/0d36ec76edc7/disk-c0b83251.raw.xz
> > > > > > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/6f9bb4e37dd0/vmlinux-c0b83251.xz
> > > > > > > > > kernel image: https://storage.googleapis.com/syzbot-assets/2349287b14b7/bzImage-c0b83251.xz
> > > > > > > > >
> > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > > > > Reported-by: syzbot+dce04ed6d1438ad69656@...kaller.appspotmail.com
> > > > > > > > >
> > > > > > > > > =============================
> > > > > > > > > WARNING: suspicious RCU usage
> > > > > > > > > 6.9.0-rc2-next-20240402-syzkaller #0 Not tainted
> > > > > > > > > -----------------------------
> > > > > > > > > kernel/rcu/tree.c:276 Illegal rcu_softirq_qs() in RCU read-side critical section!
> > > > > > > > >
> > > > > > > > > other info that might help us debug this:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > rcu_scheduler_active = 2, debug_locks = 1
> > > > > > > > > 1 lock held by ksoftirqd/0/16:
> > > > > > > > >  #0: ffffffff8e334d20 (rcu_read_lock_sched){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
> > > > > > > > >  #0: ffffffff8e334d20 (rcu_read_lock_sched){....}-{1:2}, at: rcu_read_lock_sched include/linux/rcupdate.h:933 [inline]
> > > > > > > > >  #0: ffffffff8e334d20 (rcu_read_lock_sched){....}-{1:2}, at: pfn_valid include/linux/mmzone.h:2019 [inline]
> > > > > > > > >  #0: ffffffff8e334d20 (rcu_read_lock_sched){....}-{1:2}, at: __virt_addr_valid+0x183/0x520 arch/x86/mm/physaddr.c:65
> > > > > > > > >
> > > > > > > > > stack backtrace:
> > > > > > > > > CPU: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.9.0-rc2-next-20240402-syzkaller #0
> > > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
> > > > > > > > > Call Trace:
> > > > > > > > >  <IRQ>
> > > > > > > > >  __dump_stack lib/dump_stack.c:88 [inline]
> > > > > > > > >  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> > > > > > > > >  lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712
> > > > > > > > >  rcu_softirq_qs+0xd9/0x370 kernel/rcu/tree.c:273
> > > > > > > > >  __do_softirq+0x5fd/0x980 kernel/softirq.c:568
> > > > > >
> > > > > > Huh.  This statement is supposed to prevent this call to __do_softirq()
> > > > > > from interrupt exit::
> > > > > >
> > > > > >         if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
> > > > > >             __this_cpu_read(ksoftirqd) == current)
> > > > > >
> > > > > > So was the ksoftirqd kthread interrupted at a point where it happens to
> > > > > > have softirq enabled?
> > > > >
> > > > > It should look like this:
> > > > > schedule()
> > > > > switch_to ksoftirqd/0
> > > > > finish_task_switch
> > > >
> > > > So this CPU's ksoftirqd task is running.
> > > >
> > > > > ->put_task_struct_rcu_user
> > > > >    ->call_rcu(&task->rcu, delayed_put_task_struct)
> > > > >       ->__kasan_record_aux_stack
> > > > >          ->pfn_valid
> > > > >             ->rcu_read_lock_sched
> > > > >                 <interrupt>
> > > > >                  __irq_exit_rcu
> > > > >                  ->__do_softirq
> > > > >                     -> if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
> > > > >                              __this_cpu_read(ksoftirqd) == current)
> > > >
> > > > But we are also running __do_softirq() from return from interrupt.  While
> > > > running in this mode, we are not supposed to invoke rcu_softirq_qs().
> > > > But the "__this_cpu_read(ksoftirqd) == current" check yields "true",
> > > > so we do call rcu_softirq_qs() anyway.  That is a bug.
> > > >
> > > > We need to upgrade or replace that check to something that returns true
> > > > only if called at process level from ksoftirqd.
> > > >
> > > > Any thoughts on a good way to do that?  For example, would adding "&&
> > > > in_task()" do the trick, or are there other unfortunate corner cases?
> > >
> > > The rcu_softirq_qs() is invoked in softirq_handle_begin/end() critical section,
> > > in softirqd/0 task context,  the "in_task()" should return false, will miss
> > > qs report in softirqd/0 task context.
> > >
> > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > index b315b21fb28c..9b8f0c0f7675 100644
> > > --- a/kernel/softirq.c
> > > +++ b/kernel/softirq.c
> > > @@ -563,10 +563,6 @@ asmlinkage __visible void __softirq_entry
> > > __do_softirq(void)
> > >                 pending >>= softirq_bit;
> > >         }
> > >
> > > -       if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
> > > -           __this_cpu_read(ksoftirqd) == current)
> > > -               rcu_softirq_qs();
> > > -
> > >         local_irq_disable();
> > >
> > >         pending = local_softirq_pending();
> > > @@ -915,6 +911,8 @@ static int ksoftirqd_should_run(unsigned int cpu)
> > >
> > >  static void run_ksoftirqd(unsigned int cpu)
> > >  {
> > > +       unsigned long last_qs = jiffies;
> > > +
> > >         ksoftirqd_run_begin();
> > >         if (local_softirq_pending()) {
> > >                 /*
> > > @@ -923,6 +921,7 @@ static void run_ksoftirqd(unsigned int cpu)
> > >                  */
> > >                 __do_softirq();
> > >                 ksoftirqd_run_end();
> > > +               rcu_softirq_qs_periodic(last_qs);
> >
> > Unfortunately, we need the quiescent state to be within __do_softirq().
> >
> > >                 cond_resched();
> > >                 return;
> > >         }
> > >
> > > Any thoughts?
> >
> > Can we mask the return value from preempt_count(), for example, as shown
> > in the CONFIG_PREEMPTION=n version of rcu_flavor_sched_clock_irq()?
> >
> > The trick is that we should be able to ignore SOFTIRQ_MASK because
> > __do_softirq() should not be invoked when softirqs are disabled.
> > Emphasis on "should".  ;-)
> 
> 
> 
> Does it mean this?
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index b315b21fb28c..315b717ec944 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -517,6 +517,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>         bool in_hardirq;
>         __u32 pending;
>         int softirq_bit;
> +       bool preempt_enabled = (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
> +                                       !(preempt_count() & PREEMPT_MASK));
> 
>         /*
>          * Mask out PF_MEMALLOC as the current task context is borrowed for the
> @@ -564,7 +566,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>         }
> 
>         if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
> -           __this_cpu_read(ksoftirqd) == current)
> +           __this_cpu_read(ksoftirqd) == current &&
> +           preempt_enabled && !rcu_preempt_depth())
>                 rcu_softirq_qs();
> 
>         local_irq_disable();
> 
> For built with CONFIG_PREEMPTION=n and CONFIG_COUNT=n kernels,
> the preempt_enabled is always  false.

Assuming that this works, it is preferred over the next one.

							Thanx, Paul

> or how about this?
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index b315b21fb28c..80b9fb4ee562 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -66,6 +66,8 @@ const char * const softirq_to_name[NR_SOFTIRQS] = {
>         "TASKLET", "SCHED", "HRTIMER", "RCU"
>  };
> 
> +static DEFINE_PER_CPU(bool, ksoftirqd_work);
> +
>  /*
>   * we cannot loop indefinitely here to avoid userspace starvation,
>   * but we also don't want to introduce a worst case 1/HZ latency
> @@ -404,10 +406,12 @@ static inline void softirq_handle_end(void)
>  static inline void ksoftirqd_run_begin(void)
>  {
>         local_irq_disable();
> +       per_cpu(ksoftirqd_work, smp_processor_id()) = true;
>  }
> 
>  static inline void ksoftirqd_run_end(void)
>  {
> +       per_cpu(ksoftirqd_work, smp_processor_id()) = false;
>         local_irq_enable();
>  }
> 
> @@ -564,7 +568,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>         }
> 
>         if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
> -           __this_cpu_read(ksoftirqd) == current)
> +           __this_cpu_read(ksoftirqd) == current && ksoftirqd_work)
>                 rcu_softirq_qs();
> 
>         local_irq_disable();
> @@ -970,6 +974,10 @@ static struct smp_hotplug_thread softirq_threads = {
> 
>  static __init int spawn_ksoftirqd(void)
>  {
> +       int cpu;
> +
> +       for_each_possible_cpu(cpu)
> +               per_cpu(ksoftirqd_work, cpu) = false;
>         cpuhp_setup_state_nocalls(CPUHP_SOFTIRQ_DEAD, "softirq:dead", NULL,
>                                   takeover_tasklets);
>         BUG_ON(smpboot_register_percpu_thread(&softirq_threads));
> 
> 
> Thanks
> Zqiang
> 
> 
> 
> >
> >                                                         Thanx, Paul
> >
> > > Thanks
> > > Zqiang
> > >
> > >
> > > >
> > > > (And good job tracking this down, by the way!)
> > > >
> > > >                                                         Thanx, Paul
> > > >
> > > > >                               rcu_softirq_qs
> > > > >                                ->
> > > > > RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map))
> > > > >
> > > > > Thanks
> > > > > Zqiang
> > > > >
> > > > >
> > > > > >
> > > > > >                                                         Thanx, Paul
> > > > > >
> > > > > > > > >  invoke_softirq kernel/softirq.c:428 [inline]
> > > > > > > > >  __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
> > > > > > > > >  irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
> > > > > > > > >  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
> > > > > > > > >  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
> > > > > > > > >  </IRQ>
> > > > > > > > >  <TASK>
> > > > > > > > >  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
> > > > > > > > > RIP: 0010:debug_lockdep_rcu_enabled+0xd/0x40 kernel/rcu/update.c:320
> > > > > > > > > Code: f5 90 0f 0b 90 90 90 eb c6 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 31 c0 83 3d c7 0f 28 04 00 <74> 1e 83 3d 26 42 28 04 00 74 15 65 48 8b 0c 25 c0 d3 03 00 31 c0
> > > > > > > > > RSP: 0018:ffffc90000157a50 EFLAGS: 00000202
> > > > > > > > > RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000001
> > > > > > > > > RDX: dffffc0000000000 RSI: ffffffff8bcae740 RDI: ffffffff8c1f7ec0
> > > > > > > > > RBP: dffffc0000000000 R08: ffffffff92f3a527 R09: 1ffffffff25e74a4
> > > > > > > > > R10: dffffc0000000000 R11: fffffbfff25e74a5 R12: 0000000029373578
> > > > > > > > > R13: 1ffff9200002af64 R14: ffffffff814220f3 R15: ffff88813fff90a0
> > > > > > > > >  rcu_read_lock_sched include/linux/rcupdate.h:934 [inline]
> > > > > > > > >  pfn_valid include/linux/mmzone.h:2019 [inline]
> > > > > > > > >  __virt_addr_valid+0x1a9/0x520 arch/x86/mm/physaddr.c:65
> > > > > > > > >  kasan_addr_to_slab+0xd/0x80 mm/kasan/common.c:37
> > > > > > > > >  __kasan_record_aux_stack+0x11/0xc0 mm/kasan/generic.c:526
> > > > > > > >
> > > > > > > >
> > > > > > > > This should be caused by the following commit:
> > > > > > > > d818cc76e2b4 ("kasan: Record work creation stack trace with interrupts enabled")
> > > > > > > >
> > > > > > > > Is it possible to make the rcu_softirq_qs() run only in ksoftirqd task?
> > > > > > >
> > > > > > > use rcu_softirq_qs_periodic() in run_ksoftirqd().
> > > > > > >
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > Zqiang
> > > > > > > >
> > > > > > > > >  __call_rcu_common kernel/rcu/tree.c:3096 [inline]
> > > > > > > > >  call_rcu+0x167/0xa70 kernel/rcu/tree.c:3200
> > > > > > > > >  context_switch kernel/sched/core.c:5412 [inline]
> > > > > > > > >  __schedule+0x17f0/0x4a50 kernel/sched/core.c:6746
> > > > > > > > >  __schedule_loop kernel/sched/core.c:6823 [inline]
> > > > > > > > >  schedule+0x14b/0x320 kernel/sched/core.c:6838
> > > > > > > > >  smpboot_thread_fn+0x61e/0xa30 kernel/smpboot.c:160
> > > > > > > > >  kthread+0x2f0/0x390 kernel/kthread.c:388
> > > > > > > > >  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> > > > > > > > >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
> > > > > > > > >  </TASK>
> > > > > > > > > ----------------
> > > > > > > > > Code disassembly (best guess):
> > > > > > > > >    0:   f5                      cmc
> > > > > > > > >    1:   90                      nop
> > > > > > > > >    2:   0f 0b                   ud2
> > > > > > > > >    4:   90                      nop
> > > > > > > > >    5:   90                      nop
> > > > > > > > >    6:   90                      nop
> > > > > > > > >    7:   eb c6                   jmp    0xffffffcf
> > > > > > > > >    9:   0f 1f 40 00             nopl   0x0(%rax)
> > > > > > > > >    d:   90                      nop
> > > > > > > > >    e:   90                      nop
> > > > > > > > >    f:   90                      nop
> > > > > > > > >   10:   90                      nop
> > > > > > > > >   11:   90                      nop
> > > > > > > > >   12:   90                      nop
> > > > > > > > >   13:   90                      nop
> > > > > > > > >   14:   90                      nop
> > > > > > > > >   15:   90                      nop
> > > > > > > > >   16:   90                      nop
> > > > > > > > >   17:   90                      nop
> > > > > > > > >   18:   90                      nop
> > > > > > > > >   19:   90                      nop
> > > > > > > > >   1a:   90                      nop
> > > > > > > > >   1b:   90                      nop
> > > > > > > > >   1c:   90                      nop
> > > > > > > > >   1d:   f3 0f 1e fa             endbr64
> > > > > > > > >   21:   31 c0                   xor    %eax,%eax
> > > > > > > > >   23:   83 3d c7 0f 28 04 00    cmpl   $0x0,0x4280fc7(%rip)        # 0x4280ff1
> > > > > > > > > * 2a:   74 1e                   je     0x4a <-- trapping instruction
> > > > > > > > >   2c:   83 3d 26 42 28 04 00    cmpl   $0x0,0x4284226(%rip)        # 0x4284259
> > > > > > > > >   33:   74 15                   je     0x4a
> > > > > > > > >   35:   65 48 8b 0c 25 c0 d3    mov    %gs:0x3d3c0,%rcx
> > > > > > > > >   3c:   03 00
> > > > > > > > >   3e:   31 c0                   xor    %eax,%eax
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ---
> > > > > > > > > This report is generated by a bot. It may contain errors.
> > > > > > > > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > > > > > > > syzbot engineers can be reached at syzkaller@...glegroups.com.
> > > > > > > > >
> > > > > > > > > syzbot will keep track of this issue. See:
> > > > > > > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > > > > > > >
> > > > > > > > > If the report is already addressed, let syzbot know by replying with:
> > > > > > > > > #syz fix: exact-commit-title
> > > > > > > > >
> > > > > > > > > If you want syzbot to run the reproducer, reply with:
> > > > > > > > > #syz test: git://repo/address.git branch-or-commit-hash
> > > > > > > > > If you attach or paste a git patch, syzbot will apply it before testing.
> > > > > > > > >
> > > > > > > > > If you want to overwrite report's subsystems, reply with:
> > > > > > > > > #syz set subsystems: new-subsystem
> > > > > > > > > (See the list of subsystem names on the web dashboard)
> > > > > > > > >
> > > > > > > > > If the report is a duplicate of another one, reply with:
> > > > > > > > > #syz dup: exact-subject-of-another-report
> > > > > > > > >
> > > > > > > > > If you want to undo deduplication, reply with:
> > > > > > > > > #syz undup
> > > > > > > > >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ