Message-ID: <20150908195915.GX4029@linux.vnet.ibm.com>
Date:	Tue, 8 Sep 2015 12:59:15 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Petr Mladek <pmladek@...e.com>
Cc:	Josh Triplett <josh@...htriplett.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Jiri Kosina <jkosina@...e.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] rcu: Show the real fqs_state

On Mon, Sep 07, 2015 at 04:58:27PM +0200, Petr Mladek wrote:
> On Fri 2015-09-04 16:24:22, Paul E. McKenney wrote:
> > On Fri, Sep 04, 2015 at 02:11:29PM +0200, Petr Mladek wrote:
> > > The value of "fqs_state" in struct rcu_state is always RCU_GP_IDLE.
> > > 
> > > The real state is stored in a local variable in rcu_gp_kthread().
> > > It is modified by rcu_gp_fqs() via parameter and return value.
> > > But the actual value is never stored to rsp->fqs_state.
> > > 
> > > The result is that print_one_rcu_state() does not show the real
> > > state.
> > > 
> > > > This code was added 3 years ago by commit 4cdfc175c25c89ee
> > > > ("rcu: Move quiescent-state forcing into kthread"). I guess that it
> > > > was an oversight or an optimization.
> > > 
> > > Anyway, the value seems to be manipulated only by the thread, except
> > > > for showing the status. I do not see any risk in updating it directly
> > > in the struct.
> > > 
> > > Signed-off-by: Petr Mladek <pmladek@...e.com>
> > 
> > Good catch, but how about the following fix instead?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> >     rcu: Finish folding ->fqs_state into ->gp_state
> >     
> > >     Commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
> >     into kthread") started the process of folding the old ->fqs_state
> >     into ->gp_state, but did not complete it.  This situation does not
> >     cause any malfunction, but can result in extremely confusing trace
> >     output.  This commit completes this task of eliminating ->fqs_state
> >     in favor of ->gp_state.
> 
> It makes sense but it breaks dynticks handling in rcu_gp_fqs(), see
> below.

Indeed, more confusion on my part!

> >     Reported-by: Petr Mladek <pmladek@...e.com>
> >     Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 69ab7ce2cf7b..04234936d897 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1949,16 +1949,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
> >  /*
> >   * Do one round of quiescent-state forcing.
> >   */
> > -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
> > +static void rcu_gp_fqs(struct rcu_state *rsp)
> >  {
> > -	int fqs_state = fqs_state_in;
> >  	bool isidle = false;
> >  	unsigned long maxj;
> >  	struct rcu_node *rnp = rcu_get_root(rsp);
> >  
> >  	WRITE_ONCE(rsp->gp_activity, jiffies);
> >  	rsp->n_force_qs++;
> > -	if (fqs_state == RCU_SAVE_DYNTICK) {
> > +	if (rsp->gp_state == RCU_SAVE_DYNTICK) {
> 
> This will never happen because rcu_gp_kthread() modifies rsp->gp_state
> many times. The last value before calling rcu_gp_fqs() is
> RCU_GP_DOING_FQS.
> 
> I am thinking about passing this information via a separate bool.
> 
> [...]
> 
> > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> > index d5f58e717c8b..9faad70a8246 100644
> > --- a/kernel/rcu/tree.h
> > +++ b/kernel/rcu/tree.h
> > @@ -417,12 +417,11 @@ struct rcu_data {
> >  	struct rcu_state *rsp;
> >  };
> >  
> > -/* Values for fqs_state field in struct rcu_state. */
> > +/* Values for gp_state field in struct rcu_state. */
> >  #define RCU_GP_IDLE		0	/* No grace period in progress. */
> 
> This value seems to be used instead of the new RCU_GP_WAIT_INIT.
> 
> > >  #define RCU_GP_INIT		1	/* Grace period being initialized. */
> 
> This value is unused.
> 
> > >  #define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick state. */
> 
> This one is no longer preserved when merged with the other state.
> 
> > >  #define RCU_FORCE_QS		3	/* Need to force quiescent state. */
> 
> The meaning of this one is strange. If I get it correctly,
> it is set after the state was forced. But the comment suggests
> that it is before.
> 
> In other words, these states seem to be obsoleted by
> 
> /* Values for rcu_state structure's gp_flags field. */
> #define RCU_GP_WAIT_INIT 0	/* Initial state. */
> #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
> #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
> #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
> #define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
> #define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
> #define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */
> 
> 
> Please, find below your commit updated with my ideas:
> 
> 	+ use bool save_dyntick instead of RCU_SAVE_DYNTICK
> 	  and RCU_FORCE_QS states
> 	+ rename RCU_GP_WAIT_INIT -> RCU_GP_IDLE
> 	+ remove all the obsolete states
> 
> I am sorry if I handled the "Signed-off-by" tags the wrong way. It is
> basically your patch with a few small updates from me. I am not sure
> what the right process is in this case. Feel free to use Reviewed-by
> instead of Signed-off-by with my name.
> 
> Well, I guess that this is not the final state ;-)

Good points, but perhaps an easier solution would be to have a
"firsttime" argument to rcu_gp_fqs() that said whether or not this
was the first call to rcu_gp_fqs() during the current grace period.
If this is the first call, then take the "if" branch that passes
dyntick_save_progress_counter() to force_qs_rnp(), otherwise take the
other branch.
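
A minimal user-space sketch of the "firsttime" idea described above, not
kernel code and not the eventual patch.  Everything prefixed with fake_,
along with snapshot_scan() and recheck_scan(), is a hypothetical stand-in:
the real rcu_gp_fqs() would pass dyntick_save_progress_counter() to
force_qs_rnp() on the first call and take the other branch afterwards.
Here the two branches only bump counters so the control flow is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rcu_state; it only counts scans. */
struct fake_rsp {
	int n_force_qs;	/* Rounds of quiescent-state forcing. */
	int n_snapshot;	/* First-call scans (dyntick snapshots). */
	int n_recheck;	/* Later scans (the "else" branch). */
};

/* Stand-in for force_qs_rnp(rsp, dyntick_save_progress_counter, ...). */
static void snapshot_scan(struct fake_rsp *rsp)
{
	rsp->n_snapshot++;
}

/* Stand-in for the branch that handles dyntick-idle and offline CPUs. */
static void recheck_scan(struct fake_rsp *rsp)
{
	rsp->n_recheck++;
}

/* One round of forcing; the caller says whether this is the first call. */
static void fake_rcu_gp_fqs(struct fake_rsp *rsp, bool firsttime)
{
	rsp->n_force_qs++;
	if (firsttime)
		snapshot_scan(rsp);
	else
		recheck_scan(rsp);
}

/* Models the forcing loop of one grace period in the GP kthread. */
static void fake_gp_fqs_loop(struct fake_rsp *rsp, int rounds)
{
	bool firsttime = true;	/* Reset at the start of each grace period. */
	int i;

	assert(rounds >= 0);
	for (i = 0; i < rounds; i++) {
		fake_rcu_gp_fqs(rsp, firsttime);
		firsttime = false;
	}
}
```

In this sketch the flag lives in the caller's loop, so nothing needs to be
added to the state structure and ->gp_state stays free for tracing.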

An alternative approach would use the bottom bit of ->gp_state to
record whether or not the current grace period had done its first
call to rcu_gp_fqs().
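
One way the bottom-bit alternative might look, again as a user-space
sketch under stated assumptions rather than a proposed patch.  The
GP_FQS_DONE_BIT flag, the GP_STATE() shift, the FAKE_GP_* names, and all
helpers are hypothetical; the state numbers 0, 3, and 4 mirror the
existing idle, RCU_GP_WAIT_FQS, and RCU_GP_DOING_FQS values.  Shifting
the states left by one frees bit 0 to record whether the first forcing
pass has been done.

```c
#include <assert.h>
#include <stdbool.h>

#define GP_FQS_DONE_BIT	0x1	/* Hypothetical: first fqs pass done. */
#define GP_STATE(n)	((short)((n) << 1))	/* Keep bit 0 free. */

#define FAKE_GP_IDLE		GP_STATE(0)
#define FAKE_GP_WAIT_FQS	GP_STATE(3)
#define FAKE_GP_DOING_FQS	GP_STATE(4)

/* Change the state while preserving the first-pass bit. */
static short gp_set_state(short gp_state, short new_state)
{
	assert((new_state & GP_FQS_DONE_BIT) == 0);
	return (short)(new_state | (gp_state & GP_FQS_DONE_BIT));
}

/* Has this grace period already done its first forcing pass? */
static bool gp_first_fqs_done(short gp_state)
{
	return (gp_state & GP_FQS_DONE_BIT) != 0;
}

/* Record that the first forcing pass has completed. */
static short gp_mark_first_fqs(short gp_state)
{
	return (short)(gp_state | GP_FQS_DONE_BIT);
}

/* Strip the flag, for example before printing the state in a trace. */
static short gp_base_state(short gp_state)
{
	return (short)(gp_state & ~GP_FQS_DONE_BIT);
}
```

A new grace period would start from FAKE_GP_IDLE with the bit clear.  The
cost of this layout is that every reader of the state, tracing included,
has to mask off bit 0 first.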

But I am not generating the patch today, just flew across the Pacific
yesterday.  ;-)

						Thanx, Paul

> From 61a1bf6659f4f4c0c4021f185bc156f8c83f9ea5 Mon Sep 17 00:00:00 2001
> From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> Date: Fri, 4 Sep 2015 16:24:22 -0700
> Subject: [PATCH] rcu: Finish folding ->fqs_state into ->gp_state
> 
> Commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
> into kthread") started the process of folding the old ->fqs_state
> into ->gp_state, but did not complete it.  This situation does not
> cause any malfunction, but can result in extremely confusing trace
> output.  This commit completes this task of eliminating ->fqs_state
> in favor of ->gp_state.
> 
> The old fqs_state had one side effect.  It was used to decide whether
> to collect dyntick-idle snapshots.  For this purpose, we add a boolean
> into the state struct.
> 
> Reported-by: Petr Mladek <pmladek@...e.com>
> Signed-off-by: Petr Mladek <pmladek@...e.com>
> ---
>  kernel/rcu/tree.c       | 17 +++++++----------
>  kernel/rcu/tree.h       | 16 +++++-----------
>  kernel/rcu/tree_trace.c |  2 +-
>  3 files changed, 13 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9f75f25cc5d9..f47067fdc783 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -98,7 +98,7 @@ struct rcu_state sname##_state = { \
>  	.level = { &sname##_state.node[0] }, \
>  	.rda = &sname##_data, \
>  	.call = cr, \
> -	.fqs_state = RCU_GP_IDLE, \
> +	.gp_state = RCU_GP_IDLE, \
>  	.gpnum = 0UL - 300UL, \
>  	.completed = 0UL - 300UL, \
>  	.orphan_lock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.orphan_lock), \
> @@ -1927,16 +1927,15 @@ static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
>  /*
>   * Do one round of quiescent-state forcing.
>   */
> -static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
> +static void rcu_gp_fqs(struct rcu_state *rsp)
>  {
> -	int fqs_state = fqs_state_in;
>  	bool isidle = false;
>  	unsigned long maxj;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
> 
>  	WRITE_ONCE(rsp->gp_activity, jiffies);
>  	rsp->n_force_qs++;
> -	if (fqs_state == RCU_SAVE_DYNTICK) {
> +	if (rsp->save_dyntick) {
>  		/* Collect dyntick-idle snapshots. */
>  		if (is_sysidle_rcu_state(rsp)) {
>  			isidle = true;
> @@ -1945,7 +1944,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
>  		force_qs_rnp(rsp, dyntick_save_progress_counter,
>  			     &isidle, &maxj);
>  		rcu_sysidle_report_gp(rsp, isidle, maxj);
> -		fqs_state = RCU_FORCE_QS;
> +		rsp->save_dyntick = false;
>  	} else {
>  		/* Handle dyntick-idle and offline CPUs. */
>  		isidle = true;
> @@ -1959,7 +1958,6 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
>  			   READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS);
>  		raw_spin_unlock_irq(&rnp->lock);
>  	}
> -	return fqs_state;
>  }
> 
>  /*
> @@ -2023,7 +2021,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>  	/* Declare grace period done. */
>  	WRITE_ONCE(rsp->completed, rsp->gpnum);
>  	trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end"));
> -	rsp->fqs_state = RCU_GP_IDLE;
> +	rsp->gp_state = RCU_GP_IDLE;
>  	rdp = this_cpu_ptr(rsp->rda);
>  	/* Advance CBs to reduce false positives below. */
>  	needgp = rcu_advance_cbs(rsp, rnp, rdp) || needgp;
> @@ -2041,7 +2039,6 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>   */
>  static int __noreturn rcu_gp_kthread(void *arg)
>  {
> -	int fqs_state;
>  	int gf;
>  	unsigned long j;
>  	int ret;
> @@ -2073,7 +2070,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  		}
> 
>  		/* Handle quiescent-state forcing. */
> -		fqs_state = RCU_SAVE_DYNTICK;
> +		rsp->save_dyntick = true;
>  		j = jiffies_till_first_fqs;
>  		if (j > HZ) {
>  			j = HZ;
> @@ -2101,7 +2098,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  				trace_rcu_grace_period(rsp->name,
>  						       READ_ONCE(rsp->gpnum),
>  						       TPS("fqsstart"));
> -				fqs_state = rcu_gp_fqs(rsp, fqs_state);
> +				rcu_gp_fqs(rsp);
>  				trace_rcu_grace_period(rsp->name,
>  						       READ_ONCE(rsp->gpnum),
>  						       TPS("fqsend"));
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 2e991f8361e4..12303ff25077 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -412,13 +412,6 @@ struct rcu_data {
>  	struct rcu_state *rsp;
>  };
> 
> -/* Values for fqs_state field in struct rcu_state. */
> -#define RCU_GP_IDLE		0	/* No grace period in progress. */
> -#define RCU_GP_INIT		1	/* Grace period being initialized. */
> -#define RCU_SAVE_DYNTICK	2	/* Need to scan dyntick state. */
> -#define RCU_FORCE_QS		3	/* Need to force quiescent state. */
> -#define RCU_SIGNAL_INIT		RCU_SAVE_DYNTICK
> -
>  /* Values for nocb_defer_wakeup field in struct rcu_data. */
>  #define RCU_NOGP_WAKE_NOT	0
>  #define RCU_NOGP_WAKE		1
> @@ -469,15 +462,16 @@ struct rcu_state {
> 
>  	/* The following fields are guarded by the root rcu_node's lock. */
> 
> -	u8	fqs_state ____cacheline_internodealigned_in_smp;
> -						/* Force QS state. */
> -	u8	boost;				/* Subject to priority boost. */
> +	u8	boost ____cacheline_internodealigned_in_smp;
> +						/* Subject to priority boost. */
>  	unsigned long gpnum;			/* Current gp number. */
>  	unsigned long completed;		/* # of last completed gp. */
>  	struct task_struct *gp_kthread;		/* Task for grace periods. */
>  	wait_queue_head_t gp_wq;		/* Where GP task waits. */
>  	short gp_flags;				/* Commands for GP task. */
>  	short gp_state;				/* GP kthread sleep state. */
> +	bool save_dyntick;			/* Collect dyntick-idle */
> +						/* snapshots when forcing QS. */
> 
>  	/* End of fields guarded by root rcu_node's lock. */
> 
> @@ -539,7 +533,7 @@ struct rcu_state {
>  #define RCU_GP_FLAG_FQS  0x2	/* Need grace-period quiescent-state forcing. */
> 
>  /* Values for rcu_state structure's gp_flags field. */
> -#define RCU_GP_WAIT_INIT 0	/* Initial state. */
> +#define RCU_GP_IDLE	 0	/* Initial state and no GP in progress. */
>  #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
>  #define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
>  #define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
> diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c
> index 6fc4c5ff3bb5..1d61f5ba4641 100644
> --- a/kernel/rcu/tree_trace.c
> +++ b/kernel/rcu/tree_trace.c
> @@ -268,7 +268,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp)
>  	gpnum = rsp->gpnum;
>  	seq_printf(m, "c=%ld g=%ld s=%d jfq=%ld j=%x ",
>  		   ulong2long(rsp->completed), ulong2long(gpnum),
> -		   rsp->fqs_state,
> +		   rsp->gp_state,
>  		   (long)(rsp->jiffies_force_qs - jiffies),
>  		   (int)(jiffies & 0xffff));
>  	seq_printf(m, "nfqs=%lu/nfqsng=%lu(%lu) fqlh=%lu oqlen=%ld/%ld\n",
> -- 
> 1.8.5.6
> 
> 
