Date:   Thu, 19 Mar 2020 10:41:44 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     rcu@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel-team@...com, mingo@...nel.org, jiangshanlai@...il.com,
        dipankar@...ibm.com, akpm@...ux-foundation.org,
        mathieu.desnoyers@...icios.com, josh@...htriplett.org,
        tglx@...utronix.de, peterz@...radead.org, dhowells@...hat.com,
        edumazet@...gle.com, fweisbec@...il.com, oleg@...hat.com,
        joel@...lfernandes.org
Subject: Re: [PATCH RFC v2 tip/core/rcu 02/22] rcu: Add per-task state to RCU
 CPU stall warnings

On Thu, Mar 19, 2020 at 01:27:31PM -0400, Steven Rostedt wrote:
> On Wed, 18 Mar 2020 17:10:40 -0700
> paulmck@...nel.org wrote:
> 
> > From: "Paul E. McKenney" <paulmck@...nel.org>
> > 
> > Currently, an RCU-preempt CPU stall warning simply lists the PIDs of
> > those tasks holding up the current grace period.  This can be helpful,
> > but more can be even more helpful.
> > 
> > To this end, this commit adds the nesting level, whether the task
> > things it was preempted in its current RCU read-side critical section,
> 
> s/things/thinks/

I thing that was an excellent catch, thank you!  ;-)

> > whether RCU core has asked this task for a quiescent state, whether the
> > expedited-grace-period hint is set, and whether the task believes that
> > it is on the blocked-tasks list (it must be, or it would not be printed,
> > but if things are broken, best not to take too much for granted).
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> > ---
> >  kernel/rcu/tree_stall.h | 38 ++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 36 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index 502b4dd..e19487d 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -192,14 +192,40 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
> >  	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> >  }
> >  
> > +// Communicate task state back to the RCU CPU stall warning request.
> > +struct rcu_stall_chk_rdr {
> > +	int nesting;
> > +	union rcu_special rs;
> > +	bool on_blkd_list;
> > +};
> > +
> > +/*
> > + * Report out the state of a not-running task that is stalling the
> > + * current RCU grace period.
> > + */
> > +static bool check_slow_task(struct task_struct *t, void *arg)
> > +{
> > +	struct rcu_node *rnp;
> > +	struct rcu_stall_chk_rdr *rscrp = arg;
> > +
> > +	if (task_curr(t))
> > +		return false; // It is running, so decline to inspect it.
> 
> Since it can be locked on_rq(), should we report that too?

If it is locked on_rq() but !task_curr(t), it is runnable but not running.
Because the runqueue lock is held in that case, it cannot start running,
so the remainder of this function can safely inspect its state.  The
runqueue locks will supply the required ordering, ensuring a consistent
snapshot of the task's state.

However, if it is task_curr(t), which implies on_rq() as I understand
it, the task is running and therefore might be changing its state, and
doing so without any sort of attention to synchronization.  After all,
it is the task's private state that it is changing, so we don't want to
be paying the cost of any synchronization anyway.  Hence the return of
false above.
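
(Purely as an illustration of the pattern rather than the tree code: a
userspace analog in which the inspector declines if the target is
currently running, and otherwise snapshots its state while holding the
lock that would be needed to start it running.  Every name in this
sketch is invented; none of it corresponds to kernel/rcu/ APIs.)

	/* Userspace analog of "decline if running, else snapshot under the
	 * lock that gates running".  All identifiers are made up for the
	 * sketch. */
	#include <pthread.h>
	#include <stdbool.h>
	#include <stdio.h>

	struct fake_task {
		pthread_mutex_t rq_lock;   /* stands in for the runqueue lock */
		bool currently_running;    /* stands in for task_curr() */
		int nesting;               /* private state we want to report */
	};

	/* Fill *nesting and return true only if the task is not running;
	 * holding rq_lock then keeps it from starting to run and thus from
	 * modifying its private state out from under us. */
	static bool snapshot_if_idle(struct fake_task *t, int *nesting)
	{
		bool ok = false;

		pthread_mutex_lock(&t->rq_lock);
		if (!t->currently_running) {
			*nesting = t->nesting; /* safe while rq_lock is held */
			ok = true;
		}
		pthread_mutex_unlock(&t->rq_lock);
		return ok;
	}

	int main(void)
	{
		struct fake_task t = {
			.rq_lock = PTHREAD_MUTEX_INITIALIZER,
			.currently_running = false,
			.nesting = 1,
		};
		int n;

		if (snapshot_if_idle(&t, &n))
			printf("nesting=%d\n", n);
		else
			printf("running, declined to inspect\n");
		return 0;
	}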

Or am I missing your point?

							Thanx, Paul

> -- Steve
> 
> > +	rscrp->nesting = t->rcu_read_lock_nesting;
> > +	rscrp->rs = t->rcu_read_unlock_special;
> > +	rnp = t->rcu_blocked_node;
> > +	rscrp->on_blkd_list = !list_empty(&t->rcu_node_entry);
> > +	return true;
> > +}
> > +
> >  /*
> >   * Scan the current list of tasks blocked within RCU read-side critical
> >   * sections, printing out the tid of each.
> >   */
> >  static int rcu_print_task_stall(struct rcu_node *rnp)
> >  {
> > -	struct task_struct *t;
> >  	int ndetected = 0;
> > +	struct rcu_stall_chk_rdr rscr;
> > +	struct task_struct *t;
> >  
> >  	if (!rcu_preempt_blocked_readers_cgp(rnp))
> >  		return 0;
> > @@ -208,7 +234,15 @@ static int rcu_print_task_stall(struct rcu_node *rnp)
> >  	t = list_entry(rnp->gp_tasks->prev,
> >  		       struct task_struct, rcu_node_entry);
> >  	list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
> > -		pr_cont(" P%d", t->pid);
> > +		if (!try_invoke_on_locked_down_task(t, check_slow_task, &rscr))
> > +			pr_cont(" P%d", t->pid);
> > +		else
> > +			pr_cont(" P%d/%d:%c%c%c%c",
> > +				t->pid, rscr.nesting,
> > +				".b"[rscr.rs.b.blocked],
> > +				".q"[rscr.rs.b.need_qs],
> > +				".e"[rscr.rs.b.exp_hint],
> > +				".l"[rscr.on_blkd_list]);
> >  		ndetected++;
> >  	}
> >  	pr_cont("\n");
> 
