lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20070817205812.GA15575@linux.vnet.ibm.com>
Date:	Fri, 17 Aug 2007 13:58:12 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	ego@...ibm.com
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, dvhltc@...ibm.com,
	tytso@...ibm.com, akpm@...ux-foundation.org
Subject: Re: [PATCH RFC] CPU hotplug support for preemptible RCU

On Thu, Aug 16, 2007 at 06:26:18PM -0700, Paul E. McKenney wrote:
> On Tue, Aug 14, 2007 at 04:41:16PM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > Work in progress, not for inclusion.
> > 
> > The attached patch passes multiple hours of rcutorture while hotplugging
> > CPUs every ten seconds on 64-bit PPC and x86_64.  It fails miserably on
> > 32-bit i386 after a few hotplugs, but then again, so does stock 2.6.22
> > even without running rcutorture simultaneously.
> > 
> > Is there some extra patch or hardware dependency for CPU hotplug on
> > 32-bit i386?
> 
> Progress report: turns out to be at least partly a hardware dependency.
> I have found a couple of x86 machines that survive a goodly number of
> CPU-hotplug operations on stock 2.6.22.  They also have some other
> problems, in one case, only booting at odd intervals regardless of
> what kernel you are using.  Though when it does boot, it does seems to
> run reliably.  :-/
> 
> In contrast, doing even a small number of CPU hotplug operations on the
> other machines eventually causes themselves to NMI themselves to death
> regardless of what kernel I boot on them.  One of these machines ends up
> writing LOCK_PREFIX and nop into kernel text on CPU hotplug operations.
> While other CPUs are running.  I cannot believe that this is a good
> thing, but suppressing this with the smp-alt-boot (see the bootonly()
> function) doesn't seem to improve matters.  Even so, seems to me that
> smp-alt-boot should be enabled by default.  Unless there are objections,
> I will send a patch to do this.

And I finally located a couple of i386 machines that tolerate CPU hotplug
operations.  They both passed an overnight CPU-hotplug-plus-rcutorture
run.  ;-)

							Thanx, Paul

> > Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > ---
> > 
> >  include/linux/rcuclassic.h |    2 
> >  include/linux/rcupreempt.h |    2 
> >  kernel/rcuclassic.c        |    8 +++
> >  kernel/rcupreempt.c        |   93 +++++++++++++++++++++++++++++++++++++++++++--
> >  4 files changed, 100 insertions(+), 5 deletions(-)
> > 
> > diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/include/linux/rcuclassic.h linux-2.6.22-e-hotplugcpu/include/linux/rcuclassic.h
> > --- linux-2.6.22-d-schedclassic/include/linux/rcuclassic.h	2007-08-06 10:16:16.000000000 -0700
> > +++ linux-2.6.22-e-hotplugcpu/include/linux/rcuclassic.h	2007-08-07 18:15:08.000000000 -0700
> > @@ -83,6 +83,8 @@ static inline void rcu_bh_qsctr_inc(int 
> >  #define rcu_check_callbacks_rt(cpu, user)
> >  #define rcu_init_rt()
> >  #define rcu_needs_cpu_rt(cpu) 0
> > +#define rcu_offline_cpu_rt(cpu)
> > +#define rcu_online_cpu_rt(cpu)
> >  #define rcu_pending_rt(cpu) 0
> >  #define rcu_process_callbacks_rt(unused)
> >  
> > diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/include/linux/rcupreempt.h linux-2.6.22-e-hotplugcpu/include/linux/rcupreempt.h
> > --- linux-2.6.22-d-schedclassic/include/linux/rcupreempt.h	2007-08-06 14:56:00.000000000 -0700
> > +++ linux-2.6.22-e-hotplugcpu/include/linux/rcupreempt.h	2007-08-07 18:15:10.000000000 -0700
> > @@ -59,6 +59,8 @@ extern void rcu_advance_callbacks_rt(int
> >  extern void rcu_check_callbacks_rt(int cpu, int user);
> >  extern void rcu_init_rt(void);
> >  extern int  rcu_needs_cpu_rt(int cpu);
> > +extern void rcu_offline_cpu_rt(int cpu);
> > +extern void rcu_online_cpu_rt(int cpu);
> >  extern int  rcu_pending_rt(int cpu);
> >  struct softirq_action;
> >  extern void rcu_process_callbacks_rt(struct softirq_action *unused);
> > diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/kernel/rcuclassic.c linux-2.6.22-e-hotplugcpu/kernel/rcuclassic.c
> > --- linux-2.6.22-d-schedclassic/kernel/rcuclassic.c	2007-08-06 14:07:26.000000000 -0700
> > +++ linux-2.6.22-e-hotplugcpu/kernel/rcuclassic.c	2007-08-11 08:25:55.000000000 -0700
> > @@ -404,14 +404,19 @@ static void __rcu_offline_cpu(struct rcu
> >  static void rcu_offline_cpu(int cpu)
> >  {
> >  	struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
> > +#ifdef CONFIG_CLASSIC_RCU
> >  	struct rcu_data *this_bh_rdp = &get_cpu_var(rcu_bh_data);
> > +#endif /* #ifdef CONFIG_CLASSIC_RCU */
> >  
> >  	__rcu_offline_cpu(this_rdp, &rcu_ctrlblk,
> >  					&per_cpu(rcu_data, cpu));
> > +#ifdef CONFIG_CLASSIC_RCU
> >  	__rcu_offline_cpu(this_bh_rdp, &rcu_bh_ctrlblk,
> >  					&per_cpu(rcu_bh_data, cpu));
> > -	put_cpu_var(rcu_data);
> >  	put_cpu_var(rcu_bh_data);
> > +#endif /* #ifdef CONFIG_CLASSIC_RCU */
> > +	put_cpu_var(rcu_data);
> > +	rcu_offline_cpu_rt(cpu);
> >  }
> >  
> >  #else
> > @@ -561,6 +566,7 @@ static void __devinit rcu_online_cpu(int
> >  	rdp->passed_quiesc = &per_cpu(rcu_data_passed_quiesc, cpu);
> >  	rcu_init_percpu_data(cpu, &rcu_bh_ctrlblk, bh_rdp);
> >  	bh_rdp->passed_quiesc = &per_cpu(rcu_data_bh_passed_quiesc, cpu);
> > +	rcu_online_cpu_rt(cpu);
> >  }
> >  
> >  static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
> > diff -urpNa -X dontdiff linux-2.6.22-d-schedclassic/kernel/rcupreempt.c linux-2.6.22-e-hotplugcpu/kernel/rcupreempt.c
> > --- linux-2.6.22-d-schedclassic/kernel/rcupreempt.c	2007-08-06 14:58:07.000000000 -0700
> > +++ linux-2.6.22-e-hotplugcpu/kernel/rcupreempt.c	2007-08-11 04:02:10.000000000 -0700
> > @@ -125,6 +125,8 @@ enum rcu_mb_flag_values {
> >  };
> >  static DEFINE_PER_CPU(enum rcu_mb_flag_values, rcu_mb_flag) = rcu_mb_done;
> >  
> > +static cpumask_t rcu_cpu_online_map = CPU_MASK_NONE;
> > +
> >  /*
> >   * Macro that prevents the compiler from reordering accesses, but does
> >   * absolutely -nothing- to prevent CPUs from reordering.  This is used
> > @@ -400,7 +402,7 @@ rcu_try_flip_idle(void)
> >  
> >  	/* Now ask each CPU for acknowledgement of the flip. */
> >  
> > -	for_each_possible_cpu(cpu)
> > +	for_each_cpu_mask(cpu, rcu_cpu_online_map)
> >  		per_cpu(rcu_flip_flag, cpu) = rcu_flipped;
> >  
> >  	return 1;
> > @@ -416,7 +418,7 @@ rcu_try_flip_waitack(void)
> >  	int cpu;
> >  
> >  	RCU_TRACE_ME(rcupreempt_trace_try_flip_a1);
> > -	for_each_possible_cpu(cpu)
> > +	for_each_cpu_mask(cpu, rcu_cpu_online_map)
> >  		if (per_cpu(rcu_flip_flag, cpu) != rcu_flip_seen) {
> >  			RCU_TRACE_ME(rcupreempt_trace_try_flip_ae1);
> >  			return 0;
> > @@ -460,7 +462,7 @@ rcu_try_flip_waitzero(void)
> >  
> >  	/* Call for a memory barrier from each CPU. */
> >  
> > -	for_each_possible_cpu(cpu)
> > +	for_each_cpu_mask(cpu, rcu_cpu_online_map)
> >  		per_cpu(rcu_mb_flag, cpu) = rcu_mb_needed;
> >  
> >  	RCU_TRACE_ME(rcupreempt_trace_try_flip_z2);
> > @@ -478,7 +480,7 @@ rcu_try_flip_waitmb(void)
> >  	int cpu;
> >  
> >  	RCU_TRACE_ME(rcupreempt_trace_try_flip_m1);
> > -	for_each_possible_cpu(cpu)
> > +	for_each_cpu_mask(cpu, rcu_cpu_online_map)
> >  		if (per_cpu(rcu_mb_flag, cpu) != rcu_mb_done) {
> >  			RCU_TRACE_ME(rcupreempt_trace_try_flip_me1);
> >  			return 0;
> > @@ -583,6 +585,89 @@ void rcu_advance_callbacks_rt(int cpu, i
> >  	spin_unlock_irqrestore(&rdp->lock, oldirq);
> >  }
> >  
> > +#ifdef CONFIG_HOTPLUG_CPU
> > +
> > +#define rcu_offline_cpu_rt_enqueue(srclist, srctail, dstlist, dsttail) do { \
> > +		*dsttail = srclist; \
> > +		if (srclist != NULL) { \
> > +			dsttail = srctail; \
> > +			srclist = NULL; \
> > +			srctail = &srclist;\
> > +		} \
> > +	} while (0)
> > +	
> > +
> > +void rcu_offline_cpu_rt(int cpu)
> > +{
> > +	int i;
> > +	struct rcu_head *list = NULL;
> > +	unsigned long oldirq;
> > +	struct rcu_data *rdp = RCU_DATA_CPU(cpu);
> > +	struct rcu_head **tail = &list;
> > +
> > +	/* Remove all callbacks from the newly dead CPU, retaining order. */
> > +
> > +	spin_lock_irqsave(&rdp->lock, oldirq);
> > +	rcu_offline_cpu_rt_enqueue(rdp->donelist, rdp->donetail, list, tail);
> > +	for (i = GP_STAGES - 1; i >= 0; i--)
> > +		rcu_offline_cpu_rt_enqueue(rdp->waitlist[i], rdp->waittail[i],
> > +					   list, tail);
> > +	rcu_offline_cpu_rt_enqueue(rdp->nextlist, rdp->nexttail, list, tail);
> > +	spin_unlock_irqrestore(&rdp->lock, oldirq);
> > +	rdp->waitlistcount = 0;
> > +
> > +	/* Disengage the newly dead CPU from grace-period computation. */
> > +
> > +	spin_lock_irqsave(&rcu_ctrlblk.fliplock, oldirq);
> > +	rcu_check_mb(cpu);
> > +	if (per_cpu(rcu_flip_flag, cpu) == rcu_flipped) {
> > +		smp_mb();  /* Subsequent counter accesses must see new value */
> > +		per_cpu(rcu_flip_flag, cpu) = rcu_flip_seen;
> > +		smp_mb();  /* Subsequent RCU read-side critical sections */
> > +			   /*  seen -after- acknowledgement. */
> > +	}
> > +	cpu_clear(cpu, rcu_cpu_online_map);
> > +	spin_unlock_irqrestore(&rcu_ctrlblk.fliplock, oldirq);
> > +
> > +	/*
> > +	 * Place the removed callbacks on the current CPU's queue.
> > +	 * Make them all start a new grace period: simple approach,
> > +	 * in theory could starve a given set of callbacks, but
> > +	 * you would need to be doing some serious CPU hotplugging
> > +	 * to make this happen.  If this becomes a problem, adding
> > +	 * a synchronize_rcu() to the hotplug path would be a simple
> > +	 * fix.
> > +	 */
> > +
> > +	rdp = RCU_DATA_ME();
> > +	spin_lock_irqsave(&rdp->lock, oldirq);
> > +	*rdp->nexttail = list;
> > +	if (list)
> > +		rdp->nexttail = tail;
> > +	spin_unlock_irqrestore(&rdp->lock, oldirq);
> > +}
> > +
> > +void __devinit rcu_online_cpu_rt(int cpu)
> > +{
> > +	unsigned long oldirq;
> > +
> > +	spin_lock_irqsave(&rcu_ctrlblk.fliplock, oldirq);
> > +	cpu_set(cpu, rcu_cpu_online_map);
> > +	spin_unlock_irqrestore(&rcu_ctrlblk.fliplock, oldirq);
> > +}
> > +
> > +#else /* #ifdef CONFIG_HOTPLUG_CPU */
> > +
> > +void rcu_offline_cpu(int cpu)
> > +{
> > +}
> > +
> > +void __devinit rcu_online_cpu_rt(int cpu)
> > +{
> > +}
> > +
> > +#endif /* #else #ifdef CONFIG_HOTPLUG_CPU */
> > +
> >  void rcu_process_callbacks_rt(struct softirq_action *unused)
> >  {
> >  	unsigned long flags;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ