Message-Id: <20170616173658.GA451@linux.vnet.ibm.com>
Date:   Fri, 16 Jun 2017 10:36:58 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     jiangshanlai@...il.com, linux-kernel@...r.kernel.org
Subject: Re: WARN_ON_ONCE() in process_one_work()?

On Thu, Jun 15, 2017 at 08:38:57AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 14, 2017 at 08:15:48AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 13, 2017 at 03:31:03PM -0700, Paul E. McKenney wrote:
> > > On Tue, Jun 13, 2017 at 04:58:37PM -0400, Tejun Heo wrote:
> > > > Hello, Paul.
> > > > 
> > > > On Fri, May 05, 2017 at 10:11:59AM -0700, Paul E. McKenney wrote:
> > > > > Just following up...  I have hit this bug a couple of times over the
> > > > > past few days.  Anything I can do to help?
> > > > 
> > > > My apologies for dropping the ball on this.  I've gone over the hot
> > > > plug code in workqueue several times but can't really find how this
> > > > would happen.  Can you please apply the following patch and see what
> > > > it says when the problem happens?
> > > 
> > > I have fired it up, thank you!
> > > 
> > > Last time I saw one failure in 21 hours of test runs, so I have kicked
> > > off 42 one-hour test runs.  Will see what happens tomorrow morning,
> > > Pacific Time.
> > 
> > And none of the 42 runs resulted in a workqueue splat.  I will try again
> > this evening, Pacific Time.
> > 
> > Who knows, maybe your diagnostic patch is the fix.  ;-)
> 
> And this time, we did get something!  Here is the printk() output:
> 
> [ 2126.863410] XXX workfn=vmstat_update pool->cpu/flags=1/0x0 curcpu=2 online=0-2,7 active=0,2,7
> 
> Please see below for the full splat from dmesg.
> 
> Please let me know if you need any additional information.  My test ID is KSIC
> 2017.06.14-15:50:08/TREE07.14, just to help me find it in my large pile
> of test results.  ;-)
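
For reference, here is a rough sketch of the kind of instrumentation that
could emit the "XXX workfn=..." line quoted above.  The actual debug patch
is not included in this message; the format string and the choice of
cpu_online_mask/cpu_active_mask are assumptions inferred from the output:

	/*
	 * Hypothetical debug printk() placed in process_one_work() in
	 * kernel/workqueue.c, where the work item and its worker_pool
	 * are both in scope.  The real patch may differ in detail.
	 */
	printk("XXX workfn=%ps pool->cpu/flags=%d/0x%x curcpu=%d online=%*pbl active=%*pbl\n",
	       work->func, pool->cpu, pool->flags,
	       raw_smp_processor_id(),
	       cpumask_pr_args(cpu_online_mask),
	       cpumask_pr_args(cpu_active_mask));

If "active" is indeed cpu_active_mask, the line says that the vmstat_update
work item, bound to the CPU 1 pool (flags 0x0, so not disassociated), ran on
CPU 2 while CPU 1 was online but not active, which points at a CPU-hotplug
window.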

And no test failures from yesterday evening.  So it looks like we get
somewhere on the order of one failure per 138 hours of TREE07 rcutorture
runtime with your printk() in the mix.

Was the above output from your printk() of any help?

							Thanx, Paul
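
For reference, the warning named in the Subject line is the CPU-affinity
sanity check near the top of process_one_work() in kernel/workqueue.c,
which in kernels of this era reads approximately as follows (quoted from
memory, comment paraphrased):

	/*
	 * A work item queued on a per-CPU pool is expected to run on that
	 * pool's CPU unless the pool was marked disassociated during CPU
	 * hotplug.
	 */
	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
		     raw_smp_processor_id() != pool->cpu);

The debug line quoted earlier shows pool->cpu=1 and curcpu=2 with flags=0x0,
i.e. POOL_DISASSOCIATED clear, which is exactly the condition this check is
meant to catch.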
