[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161201180953.GO3045@worktop.programming.kicks-ass.net>
Date: Thu, 1 Dec 2016 19:09:53 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@...nel.org>,
Donald Buczek <buczek@...gen.mpg.de>,
Paul Menzel <pmenzel@...gen.mpg.de>, dvteam@...gen.mpg.de,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Josh Triplett <josh@...htriplett.org>
Subject: Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and
`mem_cgroup_shrink_node`
On Thu, Dec 01, 2016 at 08:59:18AM -0800, Paul E. McKenney wrote:
> On Thu, Dec 01, 2016 at 05:36:14PM +0100, Peter Zijlstra wrote:
> > Well, with the above change cond_resched() is already sufficient, no?
>
> Maybe. Right now, cond_resched_rcu_qs() gets a quiescent state to
> the RCU core in less than one jiffy, with my other change, this becomes
> a handful of jiffies depending on HZ and NR_CPUS. I expect this
> increase to a handful of jiffies to be a non-event.
>
> After my upcoming patch, cond_resched() will get a quiescent state to
> the RCU core in about ten seconds. While I am am not all that nervous
> about the increase from less than a jiffy to a handful of jiffies,
> increasing to ten seconds via cond_resched() does make me quite nervous.
> Past experience indicates that someone's kernel will likely be fatally
> inconvenienced by this magnitude of change.
>
> Or am I misunderstanding what you are proposing?
No, that is indeed what I was proposing. Hurm.. OK let me ponder that a
bit. There might be a few games we can play with !PREEMPT to avoid IPIs.
Thing is, I'm slightly uncomfortable with de-coupling rcu-sched from
actual schedule() calls.
> > In fact, by doing the IPI thing we get the entire cond_resched*()
> > family, and we could add the should_resched() guard to
> > cond_resched_rcu().
>
> So that cond_resched_rcu_qs() looks something like this, in order
> to avoid the function call in the case where the scheduler has nothing
> to do?
I was actually thinking of this:
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 2d0c82e1d348..2dc7d8056b2a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3374,9 +3374,11 @@ static inline int signal_pending_state(long state, struct task_struct *p)
static inline void cond_resched_rcu(void)
{
#if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
- rcu_read_unlock();
- cond_resched();
- rcu_read_lock();
+ if (should_resched(1)) {
+ rcu_read_unlock();
+ cond_resched();
+ rcu_read_lock();
+ }
#endif
}
Powered by blists - more mailing lists