Message-ID: <20110511201349.GB2258@linux.vnet.ibm.com>
Date:	Wed, 11 May 2011 13:13:49 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Yinghai Lu <yinghai@...nel.org>
Cc:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL rcu/next] rcu commits for 2.6.40

On Tue, May 10, 2011 at 11:42:24PM -0700, Yinghai Lu wrote:
> On 05/10/2011 09:54 PM, Paul E. McKenney wrote:
> > On Tue, May 10, 2011 at 01:52:52PM -0700, Yinghai Lu wrote:
> >> On 05/10/2011 12:32 PM, Paul E. McKenney wrote:
> >>> On Tue, May 10, 2011 at 11:04:57AM -0700, Yinghai Lu wrote:
> >>>> On 05/10/2011 01:56 AM, Paul E. McKenney wrote:
> >>>>> On Mon, May 09, 2011 at 02:09:21PM -0700, Yinghai Lu wrote:
> >>>>>> On Mon, May 9, 2011 at 12:36 AM, Ingo Molnar <mingo@...e.hu> wrote:
> >>>>>>>
> >>>>>>> * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> >>>>>>>
> >>>>>>>> Hello, Ingo,
> >>>>>>>>
> >>>>>>>> This pull request covers RCU changes for 2.6.40.  The major new features
> >>>>>>>> are RCU priority boosting and the addition of kfree_rcu(), the latter
> >>>>>>>> courtesy of Lai Jiangshan.  These two features cover well over half
> >>>>>>>> of the commits.  There are a number of smaller features and bug fixes.
> >>>>>>>> All have been sent to LKML in the following batches:
> >>>>>>>>
> >>>>>>>> 0.    https://lkml.org/lkml/2011/2/22/660: RCU priority boosting preview
> >>>>>>>> 1.    https://lkml.org/lkml/2011/5/1/19: RCU priority boosting, kfree_rcu()
> >>>>>>>> 2.    https://lkml.org/lkml/2011/5/2/40: More uses of kfree_rcu()
> >>>>>>>> 3.    https://lkml.org/lkml/2011/5/8/60: miscellaneous
> >>>>>>>>
> >>>>>>>> The kfree_rcu() uses in the pull request have Acked-by:s from the
> >>>>>>>> maintainers.  I have some additional kfree_rcu() requests that lack
> >>>>>>>> Acked-by:s, and I will deal with these later.
> >>>>>>>>
> >>>>>>>> These changes are available in the -rcu git repository at:
> >>>>>>>>
> >>>>>>>>   git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/next
> >>>>>>>
> >>>>>>> Pulled, thanks a lot Paul!
> >>>>>>>
> >>>>>>
> >>>>>> It seems that with this one in tip, my 8-socket test setup reports a CPU stall.
> >>>>>>
> >>>>>> After hard-coding rcu_cpu_stall_suppress to enabled with this patch:
> >>>>>>
> >>>>>> Index: linux-2.6/kernel/rcutree.c
> >>>>>> ===================================================================
> >>>>>> --- linux-2.6.orig/kernel/rcutree.c
> >>>>>> +++ linux-2.6/kernel/rcutree.c
> >>>>>> @@ -174,7 +174,7 @@ module_param(blimit, int, 0);
> >>>>>>  module_param(qhimark, int, 0);
> >>>>>>  module_param(qlowmark, int, 0);
> >>>>>>
> >>>>>> -int rcu_cpu_stall_suppress __read_mostly;
> >>>>>> +int rcu_cpu_stall_suppress __read_mostly = 1;
> >>>>>>  module_param(rcu_cpu_stall_suppress, int, 0644);
> >>>>>>
> >>>>>>  static void force_quiescent_state(struct rcu_state *rsp, int relaxed);
> >>>>>>
> >>>>>> the system hangs after PnP ACPI init.
> >>>>>
> >>>>> Could you please send the stack traces from the RCU CPU stall?  Also,
> >>>>> you do have ce31332d3c77532d6ea97ddcb475a2b02dd358b4 applied, correct?
> >>>>>
> >>>>> 							Thanx, Paul
> >>>>
> >>>> Do not have time to bisect it at this point.
> >>>
> >>> Could you please send the stack traces from the RCU CPU stall?
> > 
> > Thank you!  OK, so CPU 0 has not been responding, despite resched IPIs.
> > Everyone is idle, except for CPU 124, which detected the stall, and
> > possibly CPU 0, which has csum_partial_copy_generic() on the stack, though
> > that looks like a backtrace error to me.  The fact that it hangs if you
> > disable RCU CPU stall detection leads me to believe that something real
> > is being detected.
> > 
> > This looks very similar to the situation people were seeing before
> > ce31332d3c77532d6ea97ddcb475a2b02dd358b4 was applied, so I have attached
> > the diagnostic script that helped track this down.  Could you please
> > enable CONFIG_RCU_TRACE, mount debugfs, and run the attached script,
> > and send me the output?  Please check to make sure that the script knows
> > where you mounted debugfs, of course.

And this data shows RCU running normally, which is consistent with your
report that the problem occurs only at boot time.
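
One way to read the rcugp figures quoted further down in this message is
sketched here.  It rests on two assumptions that the thread does not
state: that "completed" is printed as a signed value while "gpnum" is
printed unsigned, and that a grace period counts as in progress exactly
when the two counters differ.  The helper names are made up for
illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Same 64 bits viewed as unsigned; the log shows "completed" signed. */
uint64_t as_unsigned(int64_t v)
{
	return (uint64_t)v;
}

/* Assumed model: a grace period is in flight iff gpnum != completed. */
int gp_in_progress(uint64_t completed, uint64_t gpnum)
{
	return completed != gpnum;
}

/* Values copied from the rcugp lines quoted in this message. */
void check_rcugp_sample(void)
{
	/* rcu_bh: -282 and 18446744073709551334 are the same bit pattern. */
	assert(as_unsigned(-282) == 18446744073709551334ULL);
	assert(!gp_in_progress(as_unsigned(-282), 18446744073709551334ULL));

	/* rcu_sched: gpnum is one ahead, so one grace period is pending. */
	assert(gp_in_progress(4164, 4165));
}
```

Under those assumptions, rcu_bh was idle at the time of the sample and
rcu_sched had a single grace period outstanding, which is what a
normally running system would be expected to show.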

Thank you for collecting it.

So help me understand what is happening.  You get the stall warning
during boot, then sometimes a hang?  Or does disabling the stall warnings
prevent the boot-time hang?

							Thanx, Paul

> Wed May 11 14:33:35 UTC 2011
> /sys/kernel/debug/rcu/rcugp:
> rcu_sched: completed=4164  gpnum=4165
> rcu_bh: completed=-282  gpnum=18446744073709551334
> /sys/kernel/debug/rcu/rcuhier:
> rcu_sched:
> c=4164 g=4165 s=2 jfq=0 j=92fc nfqs=6792/nfqsng=0(6792) fqlh=737
> ff/ff ..>.. 0:254 ^0    
> fdff/ffff ..>.. 0:15 ^0    fffc/ffff ..>.. 16:31 ^1    ffff/ffff ..>.. 32:47 ^2    ffff/ffff ..>.. 48:63 ^3    fdff/ffff ..>.. 64:79 ^4    ffff/ffff ..>.. 80:95 ^5    ffff/ffff ..>.. 96:111 ^6    ffff/ffff ..>.. 112:127 ^7    0/0 ..>.. 128:143 ^8    0/0 ..>.. 144:159 ^9    0/0 ..>.. 160:175 ^10    0/0 ..>.. 176:191 ^11    0/0 ..>.. 192:207 ^12    0/0 ..>.. 208:223 ^13    0/0 ..>.. 224:239 ^14    0/0 ..>.. 240:254 ^15    
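
The leaf entries in the rcuhier line above can be decoded with a small
helper like the one below.  This is only a sketch and assumes the fields
mean "qsmask/qsmaskinit ... grplo:grphi", with a set bit in qsmask
marking a CPU that has not yet reported a quiescent state for the
current grace period.  That reading is my understanding of the trace
format, not something stated in this thread, and reported_cpus() is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed field meaning: a set bit in qsmask means the corresponding CPU
 * still owes a quiescent state; qsmaskinit has one bit per CPU covered.
 * Stores the CPU numbers that have already reported into out[] (at most
 * max entries) and returns how many were stored.
 */
int reported_cpus(uint64_t qsmask, uint64_t qsmaskinit, int grplo,
		  int out[], int max)
{
	uint64_t done = qsmaskinit & ~qsmask;	/* covered, no longer waited on */
	int n = 0;

	assert(out && max >= 0);
	for (int bit = 0; bit < 64 && n < max; bit++) {
		if (done & (UINT64_C(1) << bit))
			out[n++] = grplo + bit;
	}
	return n;
}
```

Applied to the leaf nodes shown, "fdff/ffff" at 0:15 yields CPU 9,
"fffc/ffff" at 16:31 yields CPUs 16 and 17, and "fdff/ffff" at 64:79
yields CPU 73.  On the root line the bits would stand for child nodes
rather than CPUs, so the same arithmetic gives node indexes there.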

...