Message-Id: <20160713205211.GN7094@linux.vnet.ibm.com>
Date: Wed, 13 Jul 2016 13:52:11 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Tejun Heo <tj@...nel.org>, John Stultz <john.stultz@...aro.org>,
Ingo Molnar <mingo@...hat.com>,
lkml <linux-kernel@...r.kernel.org>,
Dmitry Shmidt <dimitrysh@...gle.com>,
Rom Lemarchand <romlem@...gle.com>,
Colin Cross <ccross@...gle.com>, Todd Kjos <tkjos@...gle.com>,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup
locking changes
On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 04:18:23PM -0400, Tejun Heo wrote:
> > Hello, John.
> >
> > On Wed, Jul 13, 2016 at 01:13:11PM -0700, John Stultz wrote:
> > > On Wed, Jul 13, 2016 at 11:33 AM, Tejun Heo <tj@...nel.org> wrote:
> > > > On Wed, Jul 13, 2016 at 02:21:02PM -0400, Tejun Heo wrote:
> > > >> One interesting thing to try would be replacing it with a regular
> > > >> non-percpu rwsem and see how it behaves. That should easily tell us
> > > >> whether this is from actual contention or artifacts from percpu_rwsem
> > > >> implementation.
> > > >
> > > > So, something like the following. Can you please see whether this
> > > > makes any difference?
> > >
> > > Yea. So this brings it down for me closer to what we're seeing with
> > > Dmitry's patch reverting the two problematic commits, usually
> > > 10-50us with one early spike at 18ms.
> >
> > So, it's a percpu rwsem issue then. I haven't really followed the
> > percpu rwsem changes closely. Oleg, are multi-millisecond delays on
> > down_write expected with the current implementation of percpu_rwsem?
>
> There is a synchronize_sched() in there, so sorta. That thing is heavily
> geared towards readers, which is the only 'sane' choice for global locks.
Then one diagnostic step to take would be to replace that
synchronize_sched() with synchronize_sched_expedited(), and see if that
gets rid of the delays.
Not a particularly real-time-friendly fix, but certainly a good check
on our various assumptions.
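For reference, that grace-period wait sits right at the start of the
write-side path.  Very roughly (a paraphrased sketch of the 4.4-era
kernel/locking/percpu-rwsem.c, not the exact source):

void percpu_down_write(struct percpu_rw_semaphore *brw)
{
	/*
	 * Force readers onto the slow path and wait for a grace period
	 * so that we cannot miss a fast-path reader.  For RCU_SCHED_SYNC
	 * this ends up in synchronize_sched(), which can easily take
	 * several milliseconds.
	 */
	rcu_sync_enter(&brw->rss);

	/* Exclude other writers and block new readers. */
	down_write(&brw->rw_sem);

	/* ... then drain any remaining fast-path readers ... */
}

The patch below only shortens that first step; the rest of the locking
is unchanged.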
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index be922c9f3d37..211acddc7e21 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -38,19 +38,19 @@ static const struct {
 #endif
 } gp_ops[] = {
 	[RCU_SYNC] = {
-		.sync = synchronize_rcu,
+		.sync = synchronize_rcu_expedited,
 		.call = call_rcu,
 		.wait = rcu_barrier,
 		__INIT_HELD(rcu_read_lock_held)
 	},
 	[RCU_SCHED_SYNC] = {
-		.sync = synchronize_sched,
+		.sync = synchronize_sched_expedited,
 		.call = call_rcu_sched,
 		.wait = rcu_barrier_sched,
 		__INIT_HELD(rcu_read_lock_sched_held)
 	},
 	[RCU_BH_SYNC] = {
-		.sync = synchronize_rcu_bh,
+		.sync = synchronize_rcu_bh_expedited,
 		.call = call_rcu_bh,
 		.wait = rcu_barrier_bh,
 		__INIT_HELD(rcu_read_lock_bh_held)
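
(For completeness, the .sync member changed above is the one invoked
from rcu_sync_enter() when the rcu_sync structure is still idle --
roughly, paraphrasing kernel/rcu/sync.c rather than quoting it:

	if (need_sync) {
		/* First writer in: wait for a full grace period. */
		gp_ops[rsp->gp_type].sync();
		rsp->gp_state = GP_PASSED;
		wake_up_all(&rsp->gp_wait);
	} else if (need_wait) {
		/* Someone else is already waiting; piggyback on that. */
		wait_event(rsp->gp_wait, rsp->gp_state == GP_PASSED);
	}

so the one-line .sync changes are enough to move percpu_rwsem's write
side onto expedited grace periods.)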