Message-ID: <20220311113256.GB96127@lothringen>
Date: Fri, 11 Mar 2022 12:32:56 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Scenario TREE07 with CONFIG_PREEMPT_DYNAMIC=n?
On Thu, Mar 10, 2022 at 02:52:19PM -0800, Paul E. McKenney wrote:
> On Thu, Mar 10, 2022 at 11:41:03PM +0100, Frederic Weisbecker wrote:
> > On Thu, Mar 10, 2022 at 01:56:30PM -0800, Paul E. McKenney wrote:
> > > Hello, Frederic,
> > >
> > > I recently added CONFIG_PREEMPT_DYNAMIC=n to the TREE07 file, and since
> > > then am getting roughly one RCU CPU stall warning (or silent hang)
> > > per few tens of hours of rcutorture testing on dual-socket systems.
> > > The stall warnings feature starvation of the RCU grace-period kthread.
> > >
> > > Any advice on debugging this?
> >
> > Oh, I'm testing that!
>
> Even better, thank you! ;-)
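For reference, if I read it right, the change boils down to adding this
one line to the TREE07 scenario file (presumably
tools/testing/selftests/rcutorture/configs/rcu/TREE07):

	CONFIG_PREEMPT_DYNAMIC=n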
One possibly interesting detail: the stalling CPU is stuck in
sync_rcu_do_polled_gp(), which is launched by the recently added
start_poll_synchronize_rcu_expedited():
[ 463.518410] <IRQ>
[ 463.518691] dump_stack_lvl+0x33/0x42
[ 463.519182] nmi_cpu_backtrace+0xc0/0xe0
[ 463.519706] ? lapic_can_unplug_cpu+0x90/0x90
[ 463.522188] nmi_trigger_cpumask_backtrace+0x82/0xc0
[ 463.522863] rcu_dump_cpu_stacks+0xc0/0xf0
[ 463.523410] rcu_sched_clock_irq+0x6e3/0xa30
[ 463.523982] ? tick_sched_handle.isra.21+0x40/0x40
[ 463.524628] update_process_times+0x87/0xb0
[ 463.525183] tick_sched_handle.isra.21+0x2b/0x40
[ 463.525795] tick_sched_timer+0x5e/0x70
[ 463.526306] __hrtimer_run_queues+0x108/0x240
[ 463.526886] hrtimer_interrupt+0xe0/0x240
[ 463.527419] __sysvec_apic_timer_interrupt+0x55/0xf0
[ 463.528436] sysvec_apic_timer_interrupt+0x43/0x80
[ 463.529074] </IRQ>
[ 463.529366] <TASK>
[ 463.529656] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 463.530333] RIP: 0010:synchronize_rcu_expedited+0x0/0x3f0
[ 463.531042] Code: 89 65 48 48 89 ef e8 8f 03 af 00 48 8b 85 80 00 00 00 48 85 c0 0f 84 70 ff ff ff 4c 8b 65 68 48 89 c5 eb b6 66 0f 1f 44 00 00 <41> 54 55 53 48 81 ec 90 00 00 00 44 8b 25 3a 10 7a 01 65 48 8b 04
[ 463.533454] RSP: 0018:ffffb2e94045be88 EFLAGS: 00000287
[ 463.534141] RAX: 000000000001300c RBX: 0000000000013010 RCX: ffff98941f2298e8
[ 463.535078] RDX: 0000000000013011 RSI: 0000000000000286 RDI: ffffffff9bb3ba9c
[ 463.536007] RBP: ffffffff9bb3baa8 R08: 000070675f756372 R09: 8080808080808080
[ 463.536948] R10: ffffb2e940077d48 R11: fefefefefefefeff R12: ffffffff9bb3ba9c
[ 463.537878] R13: 0000000000000000 R14: ffff98941f2298c0 R15: ffff98941f22e005
[ 463.538810] sync_rcu_do_polled_gp+0x39/0xc0
[ 463.539382] process_one_work+0x1ec/0x3b0
[ 463.539917] worker_thread+0x25/0x390
[ 463.548472] ? process_one_work+0x3b0/0x3b0
[ 463.549193] kthread+0xbd/0xe0
[ 463.549730] ? kthread_complete_and_exit+0x20/0x20
[ 463.550548] ret_from_fork+0x22/0x30
[ 463.551167] </TASK>
Could be this:
	while (!sync_exp_work_done(s))
		synchronize_rcu_expedited();
And if synchronize_rcu_expedited() -> rcu_blocking_is_gp() returns true,
synchronize_rcu_expedited() returns right away, so we may spin in a long
busy loop within the workqueue worker without a chance to report a QS,
especially if we are running a non-preemptible kernel. This could be the
cause of the stalled grace-period kthread.
rcu_blocking_is_gp() could be true if all CPUs but the current one are
offline.
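To make that suspicion more concrete, here is a quick user-space toy
model of the control flow I have in mind (all the toy_* names and the
nr_online_cpus check are made up for illustration, this is not the
kernel code):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Models sync_exp_work_done(s): never becomes true in this toy. */
static atomic_bool polled_gp_done = false;

/* Models "all CPUs but the current one are offline". */
static int nr_online_cpus = 1;

/* Rough stand-in for rcu_blocking_is_gp() on a non-preemptible kernel. */
static bool toy_blocking_is_gp(void)
{
	return nr_online_cpus <= 1;
}

/* Rough stand-in for synchronize_rcu_expedited(): on the single-CPU
 * fast path it returns immediately without ever blocking. */
static void toy_synchronize_rcu_expedited(void)
{
	if (toy_blocking_is_gp())
		return;
	/* ...real expedited grace-period machinery would go here... */
}

int main(void)
{
	unsigned long spins = 0;

	/* Models the sync_rcu_do_polled_gp() loop quoted above: if the
	 * fast path keeps returning immediately, this is a pure busy
	 * loop with no scheduling point, so on a non-preemptible kernel
	 * the worker's CPU never reports a quiescent state. */
	while (!atomic_load(&polled_gp_done) && spins < 100000000UL) {
		toy_synchronize_rcu_expedited();
		spins++;
	}

	printf("spun %lu times without ever blocking\n", spins);
	return 0;
}

If that is indeed what happens, one obvious band-aid would be a
voluntary scheduling point (e.g. cond_resched()) inside that loop, but
there may be better options.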
It's just a potential scenario, lemme check...