Message-ID: <20120507205516.GD21152@linux.vnet.ibm.com>
Date: Mon, 7 May 2012 13:55:16 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Warning in worker_enter_idle()
On Mon, May 07, 2012 at 12:40:42PM -0700, Tejun Heo wrote:
> Hello, Paul.
>
> On Sun, May 06, 2012 at 08:38:14AM -0700, Paul E. McKenney wrote:
> > Hello!
> >
> > worker_enter_idle() is complaining that all workers are idle,
> > but that there is work remaining:
> >
> > 	/* sanity check nr_running */
> > 	WARN_ON_ONCE(gcwq->nr_workers == gcwq->nr_idle &&
> > 		     atomic_read(get_gcwq_nr_running(gcwq->cpu)));
> >
> > This is running on Power, .config attached. I must confess that I don't
> > see any sort of synchronization or memory barriers that would keep the
> > counts straight on a weakly ordered system. Or is there some clever
> > design constraint that prevents worker_enter_idle() from accessing other
> > CPUs' gcwq_nr_running variables?
>
> Workers are tied to global cpu workqueues (gcwqs). There's one gcwq
> per cpu and one unbound one, so yeah, workers access these counters
> under gcwq->lock. Atomic accesses to nr_running are depended on only
> while nr_idle is adjusted under gcwq->lock, so there shouldn't be a
> discrepancy there. Can you reproduce the problem? What was going on
> on in the system? Was a CPU being brought up or down?
I was running rcutorture with CPU hotplug operations. It has happened
a couple of times on the .config that I attached, but never under any
of the other 13 .configs that I run.
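For what it's worth, here is a quick userspace sketch of the protocol as I
understand it from your description, just so we are talking about the same
thing.  This is only an analogue using pthreads and C11 atomics, not the
workqueue code itself, and the names (gcwq_lock, nr_workers, nr_idle,
nr_running) merely mimic the kernel ones:

/* Userspace analogue only -- behavior is deliberately simplified. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NR_WORKERS	4
#define NR_LOOPS	100000

static pthread_mutex_t gcwq_lock = PTHREAD_MUTEX_INITIALIZER;
static int nr_workers = NR_WORKERS;	/* fixed; no hotplug analogue here */
static int nr_idle = NR_WORKERS;	/* all workers start out idle */
static atomic_int nr_running;

static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < NR_LOOPS; i++) {
		/* leave idle under the lock, then mark ourselves running */
		pthread_mutex_lock(&gcwq_lock);
		nr_idle--;
		pthread_mutex_unlock(&gcwq_lock);
		atomic_fetch_add(&nr_running, 1);

		/* ... pretend to process a work item ... */

		/* stop running first, then enter idle under the lock */
		atomic_fetch_sub(&nr_running, 1);
		pthread_mutex_lock(&gcwq_lock);
		nr_idle++;
		/* analogue of the WARN_ON_ONCE() in worker_enter_idle() */
		assert(!(nr_workers == nr_idle &&
			 atomic_load(&nr_running)));
		pthread_mutex_unlock(&gcwq_lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_WORKERS];

	for (int i = 0; i < NR_WORKERS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (int i = 0; i < NR_WORKERS; i++)
		pthread_join(tid[i], NULL);
	printf("no mismatch observed\n");
	return 0;
}

This should build with "gcc -pthread".  Because each worker drops its
nr_running contribution before taking the lock to bump nr_idle, and the
check runs under that same lock, I don't see how the warning could trigger
in this simplified picture.  If that matches the real code, that would be
consistent with the warning showing up only during the hotplug runs.
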
Thanx, Paul