Message-ID: <20081121155333.GB6775@linux.vnet.ibm.com>
Date:	Fri, 21 Nov 2008 07:53:33 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Folkert van Heusden <folkert@...heusden.com>
Cc:	Lai Jiangshan <laijs@...fujitsu.com>, linux-kernel@...r.kernel.org
Subject: Re: [2.6.28-rc5] RCU detected CPU 0 stall (t=4294893165/750
	jiffies)

On Fri, Nov 21, 2008 at 04:34:26PM +0100, Folkert van Heusden wrote:
> > > I'm afraid there's no script for that: it happens during boot.
> > 
> > This is a HZ=250 machine, correct?  If so, please try the following
> > patch (already in -tip), which helps suppress boot-time false positives.
> 
> That's correct, 250Hz.
> 
> > diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
> > index 5f89b62..301dda8 100644
> > --- a/include/linux/rcuclassic.h
> > +++ b/include/linux/rcuclassic.h
> > @@ -41,7 +41,7 @@
> >  #include <linux/seqlock.h>
> >  
> >  #ifdef CONFIG_RCU_CPU_STALL_DETECTOR
> > -#define RCU_SECONDS_TILL_STALL_CHECK	( 3 * HZ) /* for rcp->jiffies_stall */
> > +#define RCU_SECONDS_TILL_STALL_CHECK	(10 * HZ) /* for rcp->jiffies_stall */
> 
> Isn't it better to let the define depend on the value of CONFIG_HZ?
> E.g.
> 
> Signed-off-by: Folkert van Heusden <folkert@...heusden.com>
> 
> diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
> index 5f89b62..301dda8 100644
> --- a/include/linux/rcuclassic.h
> +++ b/include/linux/rcuclassic.h
> @@ -41,7 +41,7 @@
>  #include <linux/seqlock.h>
> 
>  #ifdef CONFIG_RCU_CPU_STALL_DETECTOR
> -#define RCU_SECONDS_TILL_STALL_CHECK	( 3 * HZ) /* for rcp->jiffies_stall */
> +#define RCU_SECONDS_TILL_STALL_CHECK	( (CONFIG_HZ / 100) * 3 * HZ) /* for rcp->jiffies_stall */
>  #define RCU_SECONDS_TILL_STALL_RECHECK	(30 * HZ) /* for rcp->jiffies_stall */
>  #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */

The stalls occur when CPUs spin in the kernel with preemption (or irqs
or whatever) disabled.  So while I suppose that there is some
possibility that such a spin might be a function of HZ, I have never
seen this happen.

The reason I asked for your HZ value was to make sure that the stall
detection timeout was 3 seconds (750 jiffies at HZ=250).  If you had been
running a 75 Hz system (admittedly unlikely), those 750 jiffies would have
meant a 10-second stall, and the patch would not help.  In that case, the
right thing to do would
have been to work out why the system was spinning for 10 seconds during
boot -- tough to get a 5-second boot when the system spins for 10
seconds coming up, right?  ;-)

							Thanx, Paul

> Folkert van Heusden
> 
> -- 
> MultiTail is an easy-to-use tool for following logfiles and the output
> of commands. Filter out specific items, work with different colors,
> merge, and a lot more.
> http://www.vanheusden.com/multitail/
> ----------------------------------------------------------------------
> Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com