lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 11 Apr 2017 20:23:07 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: There is a Tasks RCU stall warning

On Tue, Apr 11, 2017 at 04:11:38PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 11, 2017 at 04:04:45PM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 11, 2017 at 04:01:54PM -0700, Paul E. McKenney wrote:
> > > On Tue, Apr 11, 2017 at 06:15:30PM -0400, Steven Rostedt wrote:
> > > > On Tue, 11 Apr 2017 14:56:56 -0700
> > > > "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> > > > 
> > > > > On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > > > > > On Tue, 11 Apr 2017 14:44:43 -0700
> > > > > > "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> > > > > > 
> > > > > >   
> > > > > > > Works for me!
> > > > > > > 
> > > > > > > Hopefully it will also work for your computer.  :-)
> > > > > > > 
> > > > > > > And whew!  Glad to see that the stall warnings worked!  
> > > > > > 
> > > > > > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > > > > > 
> > > > > > Calling schedule isn't good enough. For rcu_tasks to continue, the task
> > > > > > needs to schedule out. With my updated code, I just triggered:
> > > > > > 
> > > > > > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > > > > > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > > > > > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > > > > > [  196.302746] Call Trace:
> > > > > > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > > > > > [  196.314453]  __schedule+0x222/0x1210
> > > > > > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > > > > > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > > > > > [  196.334174]  ? __asan_store8+0x15/0x70
> > > > > > [  196.340384]  schedule+0x57/0xe0
> > > > > > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > > > > > [  196.352823]  kthread+0x178/0x1d0
> > > > > > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > > > > > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > > > > > [  196.371999]  ret_from_fork+0x2e/0x40
> > > > > > 
> > > > > > 
> > > > > > And here my benchmark called schedule(), but nothing scheduled it out,
> > > > > > and it still fails on rcu_tasks.  
> > > > > 
> > > > > Good point!
> > > > > 
> > > > > Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> > > > > called for preemption as well as for voluntary context switches.
> > > > 
> > > > If you pass in the "preempt" variable, it would work. It's false when
> > > > __schedule() was called by a "schedule()" directly, and true when
> > > > called by one of the preempt_schedule() functions.
> > > 
> > > Like this?  (Untested, but builds at least some of the time.)
> > 
> > Not like that...  :-/  Update on its way.
> 
> Perhaps more like this.  Started rcutorture on it, will see how it goes.

And it passed moderate rcutorture testing, for whatever that is worth.

But another question...

Suppose someone traced or probed or whatever a call to (say)
cond_resched_rcu_qs().  Wouldn't that put the call to this
function in the trampoline itself?  Of course, if this happened,
life would be hard when the trampoline was freed due to
cond_resched_rcu_qs() being a quiescent state.

Or is there something that takes care to avoid putting calls to
this sort of function (and calls to any function calling this sort
of function, directly or indirectly) into a trampoline?

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> commit c9653896be9b79b7227e8b97710ad6475fc72dc1
> Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> Date:   Tue Apr 11 15:50:41 2017 -0700
> 
>     rcu: Make non-preemptive schedule be Tasks RCU quiescent state
>     
>     Currently, a call to schedule() acts as a Tasks RCU quiescent state
>     only if a context switch actually takes place.  However, just the
>     call to schedule() guarantees that the calling task has moved off of
>     whatever tracing trampoline that it might have been one previously.
>     This commit therefore plumbs schedule()'s "preempt" parameter into
>     rcu_note_context_switch(), which then records the Tasks RCU quiescent
>     state, but only if this call to schedule() was -not- due to a preemption.
>     
>     Suggested-by: Steven Rostedt <rostedt@...dmis.org>
>     Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index e6146d0074f8..f531b29207da 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -363,15 +363,20 @@ static inline void rcu_init_nohz(void)
>  #ifdef CONFIG_TASKS_RCU
>  #define TASKS_RCU(x) x
>  extern struct srcu_struct tasks_rcu_exit_srcu;
> -#define rcu_note_voluntary_context_switch(t) \
> +#define rcu_note_voluntary_context_switch_lite(t) \
>  	do { \
> -		rcu_all_qs(); \
>  		if (READ_ONCE((t)->rcu_tasks_holdout)) \
>  			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
>  	} while (0)
> +#define rcu_note_voluntary_context_switch(t) \
> +	do { \
> +		rcu_all_qs(); \
> +		rcu_note_voluntary_context_switch_lite(t); \
> +	} while (0)
>  #else /* #ifdef CONFIG_TASKS_RCU */
>  #define TASKS_RCU(x) do { } while (0)
> -#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
> +#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
> +#define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
>  #endif /* #else #ifdef CONFIG_TASKS_RCU */
>  
>  /**
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index 5219be250f00..21ea52df651d 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -92,10 +92,11 @@ static inline void kfree_call_rcu(struct rcu_head *head,
>  	call_rcu(head, func);
>  }
>  
> -static inline void rcu_note_context_switch(void)
> -{
> -	rcu_sched_qs();
> -}
> +#define rcu_note_context_switch(preempt) \
> +	do { \
> +		rcu_sched_qs(); \
> +		rcu_note_voluntary_context_switch_lite(current); \
> +	} while (0)
>  
>  /*
>   * Take advantage of the fact that there is only one CPU, which
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 63a4e4cf40a5..04a2caf3ab8b 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -30,7 +30,7 @@
>  #ifndef __LINUX_RCUTREE_H
>  #define __LINUX_RCUTREE_H
>  
> -void rcu_note_context_switch(void);
> +void rcu_note_context_switch(bool preempt);
>  int rcu_needs_cpu(u64 basem, u64 *nextevt);
>  void rcu_cpu_stall_reset(void);
>  
> @@ -41,7 +41,7 @@ void rcu_cpu_stall_reset(void);
>   */
>  static inline void rcu_virt_note_context_switch(int cpu)
>  {
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(false);
>  }
>  
>  void synchronize_rcu_bh(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f13fc4ab2f0d..92cf78fffd31 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -458,13 +458,15 @@ static void rcu_momentary_dyntick_idle(void)
>   * and requires special handling for preemptible RCU.
>   * The caller must have disabled interrupts.
>   */
> -void rcu_note_context_switch(void)
> +void rcu_note_context_switch(bool preempt)
>  {
>  	struct rcu_state *rsp;
>  
>  	barrier(); /* Avoid RCU read-side critical sections leaking down. */
>  	trace_rcu_utilization(TPS("Start context switch"));
>  	rcu_sched_qs();
> +	if (!preempt)
> +		rcu_note_voluntary_context_switch_lite(current);
>  	rcu_preempt_note_context_switch();
>  	/* Load rcu_urgent_qs before other flags. */
>  	if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9c38e6be4f3e..653ea8cf89e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3378,7 +3378,7 @@ static void __sched notrace __schedule(bool preempt)
>  		hrtick_clear(rq);
>  
>  	local_irq_disable();
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(preempt);
>  
>  	/*
>  	 * Make sure that signal_pending_state()->signal_pending() below

Powered by blists - more mailing lists