Message-ID: <20140317101300.GA27965@twins.programming.kicks-ass.net>
Date: Mon, 17 Mar 2014 11:13:00 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: mingo@...e.hu, josh@...htriplett.org, laijs@...fujitsu.com,
linux-kernel@...r.kernel.org
Subject: Re: cond_resched() and RCU CPU stall warnings
On Sat, Mar 15, 2014 at 06:59:14PM -0700, Paul E. McKenney wrote:
> So I have been tightening up rcutorture a bit over the past year.
> The other day, I came across what looked like a great opportunity for
> further tightening, namely the schedule() in rcu_torture_reader().
> Why not turn this into a cond_resched(), speeding up the readers a bit
> and placing more stress on RCU?
>
> And boy does it increase stress!
>
> Unfortunately, this increased stress sometimes shows up in the form of
> lots of RCU CPU stall warnings. These can appear when an instance of
> rcu_torture_reader() gets a CPU to itself, in which case it won't ever
> enter the scheduler, and RCU will never see a quiescent state from that
> CPU, which means the grace period never ends.
>
> So I am taking a more measured approach to cond_resched() in
> rcu_torture_reader() for the moment.
>
> But longer term, should cond_resched() imply a set of RCU
> quiescent states? One way to do this would be to add calls to
> rcu_note_context_switch() in each of the various cond_resched() functions.
> Easy change, but of course adds some overhead. On the other hand,
> there might be more than a few of the 500+ calls to cond_resched() that
> expect that RCU CPU stalls will be prevented (to say nothing of
> might_sleep() and cond_resched_lock()).
>
> Thoughts?

I share Mike's concern. Some of those functions might be too expensive
to do in the loops where we have the cond_resched()s. And while it's only
strictly required when nr_running==1, keying off that seems
unfortunate in that it makes things behave differently with a single
running task.

I suppose your proposed per-cpu counter is the best option, even though
it's still an extra cacheline hit in cond_resched().

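To make the cost concrete, here is a rough user-space model of one way such
a counter could work. The exact scheme is not spelled out in this message,
so everything below is an assumption: the struct, the function names
(counter_cond_resched(), counter_rcu_saw_qs()) and NR_MODEL_CPUS are
hypothetical and are not kernel APIs. The idea shown is that cond_resched()
bumps a per-cpu counter and RCU later compares it against a snapshot.

```c
#include <stdbool.h>

#define NR_MODEL_CPUS 4	/* hypothetical, for the model only */

struct model_cpu_counter {
	unsigned long count;	/* bumped on every cond_resched() call */
	unsigned long snap;	/* value RCU saw at its last check */
};

static struct model_cpu_counter model_counters[NR_MODEL_CPUS];

/* Stand-in for cond_resched(): the increment is the extra cacheline touch. */
static void counter_cond_resched(int cpu)
{
	model_counters[cpu].count++;
}

/*
 * Stand-in for RCU's check: the CPU is treated as having passed through a
 * quiescent state if the counter moved since the previous check.
 */
static bool counter_rcu_saw_qs(int cpu)
{
	struct model_cpu_counter *c = &model_counters[cpu];
	bool moved = c->count != c->snap;

	c->snap = c->count;
	return moved;
}
```

In this model the hot path pays one increment per call whether or not RCU
is waiting on that CPU, which is the overhead being weighed above.
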
As to the other cond_resched() variants, they might be a little more
tricky; e.g. cond_resched_lock() would have you drop the lock in order to
note the QS, etc.

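A toy illustration of that lock-dropping shape, as a user-space model only.
The types and helpers here (struct model_lock, struct model_rcu,
model_cond_resched_lock()) are made up for the sketch and do not mirror the
real kernel implementation.

```c
#include <stdbool.h>

struct model_lock { bool held; };			/* hypothetical lock */
struct model_rcu  { bool needs_qs; int qs_count; };	/* hypothetical RCU state */

static void model_lock_acquire(struct model_lock *l) { l->held = true; }
static void model_lock_release(struct model_lock *l) { l->held = false; }

/*
 * Returns 1 if the lock was dropped and retaken, 0 otherwise.
 * The quiescent state is only noted while the lock is not held.
 */
static int model_cond_resched_lock(struct model_lock *l, struct model_rcu *r)
{
	if (!r->needs_qs)
		return 0;

	model_lock_release(l);
	r->needs_qs = false;	/* note the QS with the lock dropped */
	r->qs_count++;
	model_lock_acquire(l);
	return 1;
}
```

A nonzero return tells the caller the lock was released in between, so
anything it protected has to be revalidated.
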
So one thing that might make sense is to have something like
rcu_should_qs() which will indicate RCU's need for a grace period to end.
Then we can augment the various should_resched()/spin_needbreak() etc.
with that condition.

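Roughly, as a user-space model rather than a patch. rcu_should_qs() is the
name suggested above, but its signature here is an assumption, and struct
model_cpu plus the model_*() helpers are hypothetical stand-ins for the
real should_resched()/spin_needbreak() paths.

```c
#include <stdbool.h>

struct model_cpu {
	bool need_resched;	/* scheduler wants this CPU to reschedule */
	bool lock_contended;	/* someone is waiting on the held lock */
	bool rcu_needs_qs;	/* RCU is waiting on this CPU for a QS */
	int  qs_reported;	/* how many QSes this CPU has noted */
};

/* Assumed shape: does RCU need a quiescent state from this CPU? */
static bool rcu_should_qs(const struct model_cpu *c)
{
	return c->rcu_needs_qs;
}

/* Existing condition, augmented with the RCU one. */
static bool model_should_resched(const struct model_cpu *c)
{
	return c->need_resched || rcu_should_qs(c);
}

static bool model_spin_needbreak(const struct model_cpu *c)
{
	return c->lock_contended || rcu_should_qs(c);
}

/* Returns 1 if we "rescheduled"; doing so is what notes the QS. */
static int model_cond_resched(struct model_cpu *c)
{
	if (!model_should_resched(c))
		return 0;

	c->need_resched = false;
	c->rcu_needs_qs = false;
	c->qs_reported++;
	return 1;
}
```

With this shape the common case stays a read of state the check already
looks at, and the slow path is taken only when RCU actually wants
something from the CPU.
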
That also gets rid of the counter (or at least hides it in the
implementation if RCU really can't do anything better).