[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130928192123.GA8228@gmail.com>
Date: Sat, 28 Sep 2013 21:21:23 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Waiman Long <Waiman.Long@...com>, Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Rik van Riel <riel@...hat.com>,
Peter Hurley <peter@...leysoftware.com>,
Davidlohr Bueso <davidlohr.bueso@...com>,
Alex Shi <alex.shi@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andrea Arcangeli <aarcange@...hat.com>,
Matthew R Wilcox <matthew.r.wilcox@...el.com>,
Dave Hansen <dave.hansen@...el.com>,
Michel Lespinasse <walken@...gle.com>,
Andi Kleen <andi@...stfloor.org>,
"Chandramouleeswaran, Aswin" <aswin@...com>,
"Norton, Scott J" <scott.norton@...com>
Subject: Re: [PATCH] rwsem: reduce spinlock contention in wakeup code path
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Sat, Sep 28, 2013 at 12:41 AM, Ingo Molnar <mingo@...nel.org> wrote:
> >
> >
> > Yeah, I fully agree. The reason I'm still very sympathetic to Tim's
> > efforts is that they address a regression caused by a mechanic
> > mutex->rwsem conversion:
> >
> > 5a505085f043 mm/rmap: Convert the struct anon_vma::mutex to an rwsem
> >
> > ... and Tim's patches turn that regression into an actual speedup.
>
> Btw, I really hate that thing. I think we should turn it back into a
> spinlock. None of what it protects needs a mutex or an rwsem.
>
> Because you guys talk about the regression of turning it into a rwsem,
> but nobody talks about the *original* regression.
>
> And it *used* to be a spinlock, and it was changed into a mutex back in
> 2011 by commit 2b575eb64f7a. That commit doesn't even have a reason
> listed for it, although my dim memory of it is that the reason was
> preemption latency.
Yeah, I think it was latency.
> And that caused big regressions too.
>
> Of course, since then, we may well have screwed things up and now we
> sleep under it, but I still really think it was a mistake to do it in
> the first place.
>
> So if the primary reason for this is really just that f*cking anon_vma
> lock, then I would seriously suggest:
>
> - turn it back into a spinlock (or rwlock_t, since we subsequently
> separated the read and write paths)
>
> - fix up any breakage (ie new scheduling points) that exposes
>
> - look at possible other approaches wrt latency on that thing.
>
> Hmm?
If we do that then I suspect the next step will be queued rwlocks :-/ The
current rwlock_t implementation is rather primitive by modern standards.
(We'd probably have killed rwlock_t long ago if not for the
tasklist_lock.)
But yeah, it would work and conceptually a hard spinlock fits something as
lowlevel as the anon-vma lock.
I did a quick review pass and it appears nothing obvious is scheduling
with the anon-vma lock held. If it did in a non-obvious way it's likely a
bug anyway. The hugepage code grew a lot of logic running under the
anon-vma lock, but it all seems atomic.
So a conversion to rwlock_t could be attempted. (It should be relatively
easy patch as well, because the locking operation is now nicely abstracted
out.)
Thanks,
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists