Open Source and information security mailing list archives
 
Date:   Wed, 12 Apr 2017 07:48:00 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: There is a Tasks RCU stall warning

On Tue, Apr 11, 2017 at 04:11:38PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 11, 2017 at 04:04:45PM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 11, 2017 at 04:01:54PM -0700, Paul E. McKenney wrote:
> > > On Tue, Apr 11, 2017 at 06:15:30PM -0400, Steven Rostedt wrote:
> > > > On Tue, 11 Apr 2017 14:56:56 -0700
> > > > "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> > > > 
> > > > > On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > > > > > On Tue, 11 Apr 2017 14:44:43 -0700
> > > > > > "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> > > > > > 
> > > > > >   
> > > > > > > Works for me!
> > > > > > > 
> > > > > > > Hopefully it will also work for your computer.  :-)
> > > > > > > 
> > > > > > > And whew!  Glad to see that the stall warnings worked!  
> > > > > > 
> > > > > > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > > > > > 
> > > > > > Calling schedule() isn't good enough. For rcu_tasks to continue, the task
> > > > > > needs to schedule out. With my updated code, I just triggered:
> > > > > > 
> > > > > > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > > > > > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > > > > > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > > > > > [  196.302746] Call Trace:
> > > > > > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > > > > > [  196.314453]  __schedule+0x222/0x1210
> > > > > > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > > > > > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > > > > > [  196.334174]  ? __asan_store8+0x15/0x70
> > > > > > [  196.340384]  schedule+0x57/0xe0
> > > > > > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > > > > > [  196.352823]  kthread+0x178/0x1d0
> > > > > > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > > > > > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > > > > > [  196.371999]  ret_from_fork+0x2e/0x40
> > > > > > 
> > > > > > 
> > > > > > And here my benchmark called schedule(), but nothing scheduled it out,
> > > > > > and it still fails on rcu_tasks.  
> > > > > 
> > > > > Good point!
> > > > > 
> > > > > Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> > > > > called for preemption as well as for voluntary context switches.
> > > > 
> > > > If you pass in the "preempt" variable, it would work. It's false when
> > > > __schedule() was called by a "schedule()" directly, and true when
> > > > called by one of the preempt_schedule() functions.
> > > 
> > > Like this?  (Untested, but builds at least some of the time.)
> > 
> > Not like that...  :-/  Update on its way.
> 
> Perhaps more like this.  Started rcutorture on it, will see how it goes.

Do you need this patch?  If so, I should do some more work on it to
eliminate the extra common-case branch on the scheduler fastpath.

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> commit c9653896be9b79b7227e8b97710ad6475fc72dc1
> Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> Date:   Tue Apr 11 15:50:41 2017 -0700
> 
>     rcu: Make non-preemptive schedule be Tasks RCU quiescent state
>     
>     Currently, a call to schedule() acts as a Tasks RCU quiescent state
>     only if a context switch actually takes place.  However, just the
>     call to schedule() guarantees that the calling task has moved off of
>     whatever tracing trampoline it might have been on previously.
>     This commit therefore plumbs schedule()'s "preempt" parameter into
>     rcu_note_context_switch(), which then records the Tasks RCU quiescent
>     state, but only if this call to schedule() was -not- due to a preemption.
>     
>     Suggested-by: Steven Rostedt <rostedt@...dmis.org>
>     Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index e6146d0074f8..f531b29207da 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -363,15 +363,20 @@ static inline void rcu_init_nohz(void)
>  #ifdef CONFIG_TASKS_RCU
>  #define TASKS_RCU(x) x
>  extern struct srcu_struct tasks_rcu_exit_srcu;
> -#define rcu_note_voluntary_context_switch(t) \
> +#define rcu_note_voluntary_context_switch_lite(t) \
>  	do { \
> -		rcu_all_qs(); \
>  		if (READ_ONCE((t)->rcu_tasks_holdout)) \
>  			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
>  	} while (0)
> +#define rcu_note_voluntary_context_switch(t) \
> +	do { \
> +		rcu_all_qs(); \
> +		rcu_note_voluntary_context_switch_lite(t); \
> +	} while (0)
>  #else /* #ifdef CONFIG_TASKS_RCU */
>  #define TASKS_RCU(x) do { } while (0)
> -#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
> +#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
> +#define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
>  #endif /* #else #ifdef CONFIG_TASKS_RCU */
>  
>  /**
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index 5219be250f00..21ea52df651d 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -92,10 +92,11 @@ static inline void kfree_call_rcu(struct rcu_head *head,
>  	call_rcu(head, func);
>  }
>  
> -static inline void rcu_note_context_switch(void)
> -{
> -	rcu_sched_qs();
> -}
> +#define rcu_note_context_switch(preempt) \
> +	do { \
> +		rcu_sched_qs(); \
> +		rcu_note_voluntary_context_switch_lite(current); \
> +	} while (0)
>  
>  /*
>   * Take advantage of the fact that there is only one CPU, which
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 63a4e4cf40a5..04a2caf3ab8b 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -30,7 +30,7 @@
>  #ifndef __LINUX_RCUTREE_H
>  #define __LINUX_RCUTREE_H
>  
> -void rcu_note_context_switch(void);
> +void rcu_note_context_switch(bool preempt);
>  int rcu_needs_cpu(u64 basem, u64 *nextevt);
>  void rcu_cpu_stall_reset(void);
>  
> @@ -41,7 +41,7 @@ void rcu_cpu_stall_reset(void);
>   */
>  static inline void rcu_virt_note_context_switch(int cpu)
>  {
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(false);
>  }
>  
>  void synchronize_rcu_bh(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f13fc4ab2f0d..92cf78fffd31 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -458,13 +458,15 @@ static void rcu_momentary_dyntick_idle(void)
>   * and requires special handling for preemptible RCU.
>   * The caller must have disabled interrupts.
>   */
> -void rcu_note_context_switch(void)
> +void rcu_note_context_switch(bool preempt)
>  {
>  	struct rcu_state *rsp;
>  
>  	barrier(); /* Avoid RCU read-side critical sections leaking down. */
>  	trace_rcu_utilization(TPS("Start context switch"));
>  	rcu_sched_qs();
> +	if (!preempt)
> +		rcu_note_voluntary_context_switch_lite(current);
>  	rcu_preempt_note_context_switch();
>  	/* Load rcu_urgent_qs before other flags. */
>  	if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9c38e6be4f3e..653ea8cf89e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3378,7 +3378,7 @@ static void __sched notrace __schedule(bool preempt)
>  		hrtick_clear(rq);
>  
>  	local_irq_disable();
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(preempt);
>  
>  	/*
>  	 * Make sure that signal_pending_state()->signal_pending() below
