[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20160714171550.GX7094@linux.vnet.ibm.com>
Date: Thu, 14 Jul 2016 10:15:50 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Tejun Heo <tj@...nel.org>, John Stultz <john.stultz@...aro.org>,
Ingo Molnar <mingo@...hat.com>,
lkml <linux-kernel@...r.kernel.org>,
Dmitry Shmidt <dimitrysh@...gle.com>,
Rom Lemarchand <romlem@...gle.com>,
Colin Cross <ccross@...gle.com>, Todd Kjos <tkjos@...gle.com>,
Oleg Nesterov <oleg@...hat.com>
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup
locking changes
On Thu, Jul 14, 2016 at 06:45:47PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 14, 2016 at 09:23:55AM -0700, Paul E. McKenney wrote:
> > Hmmm... How does this handle the following sequence of events for
> > the case where we are not biased towards the reader?
> >
> > o The per-CPU rwsem is set up with RCU_NONE and readers_slow
> > (as opposed to readers_block). The rcu_sync ->gp_state is
> > GP_PENDING, which means that rcu_sync_is_idle() will always
> > return true.
>
> /false/, rcu_sync_is_idle() will always be false, to force us into the
> slowpath.
Ah, got it...
> > o Task A on CPU 0 runs percpu_down_read() to completion, and remains
> > within the critical section. CPU 0's ->refcount is therefore 1.
> >
> > o Task B on CPU 1 does percpu_down_write(), which write-acquires
> > ->rw_sem, does rcu_sync_enter() (which is a no-op due to
> > RCU_NONE), sets ->state to readers_block, and is just now going
> > to wait for readers, which first invokes readers_active_check(),
> > which starts summing the ->refcount for CPUs 0, 1, and 2,
> > obtaining the value 1 so far.
> >
> > o Task C CPU 2 enters percpu_down_read(), disables preemption,
> > increments CPU 2's ->refcount, sees rcu_sync_is_idle() return
> > true (so skips __percpu_down_read()), enables preemption, and
> > enters its critical section.
>
> false, so does __percpu_down_read()
>
> >
> > o Task C migrates to CPU 3 and invokes percpu_up_read(), which
> > disables preemption, sees rcu_sync_is_idle() return true, calls
> > __this_cpu_dec() on CPU 3's ->refcount, and enables preemption.
> > The value of CPU 3's ->refcount is thus (unsigned int)-1.
>
> __percpu_up_read()
>
> >
> > o Task B on CPU 1 continues execution in readers_active_check(), with
> > the full sum being zero.
> >
> > So it looks to me like we have Task A as a writer at the same time that
> > Task A is a reader, which would not be so good.
> >
> > So what am I missing here?
>
> for RCU_NONE we init rsp->gp_state to !0, which makes:
>
> static inline rcu_sync_is_idle()'s
>
> return !rsp->gp_state (aka. rsp->gp_state == 0)
>
> return false.
>
> > And a couple of checkpatch nits below. Yes, I had to apply the patch to
> > figure out what it was doing. ;-)
>
> Yah, too much churn to read :-)
>
> In any case, rest assured you've already gone over this part of the
> patch several times. I repurposed an old percpu-rwsem optimization, Oleg
> recognised it.
OK, in that case I will hold off pending John Stultz's performance
checks.
Thanx, Paul
Powered by blists - more mailing lists