Message-ID: <CA+55aFwLtxH6D9U43oq4Pt+pjXS9rVoudVLdvGr8qiE4E+z8fw@mail.gmail.com>
Date: Fri, 6 Mar 2015 13:42:05 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Davidlohr Bueso <dave@...olabs.net>
Cc: Jason Low <jason.low2@...com>, Ingo Molnar <mingo@...nel.org>,
Sasha Levin <sasha.levin@...cle.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Dave Jones <davej@...emonkey.org.uk>
Subject: Re: sched: softlockups in multi_cpu_stop
On Fri, Mar 6, 2015 at 11:55 AM, Davidlohr Bueso <dave@...olabs.net> wrote:
>>
>> - look up the vma in the vma lookup cache
>
> But you'd still need mmap_sem there to at least get the VMA's first
> value.
So my theory was that the vma cache is such a trivial data structure
that we could trivially make it be rcu-protected.
The vma allocations are already SLAB_DESTROY_BY_RCU, because we play
games with the anon-vma stuff. Or something. I forget the exact
details.
So I think that vmacache_find() would *already* basically work under
just the RCU read lock, and we can look at the resulting vma without
having to worry about it getting free'd.
Yes, the actual field values may change (i.e. start/end offsets, due
to vma merging etc), but again, that's not necessarily deadly if we
are careful and make use of the vmacache sequence number. We can
optimistically do things like page cache lookups (which are already
RCU-safe), and then before we actually *use* the result, we do another
vmacache sequence number validation.
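
To illustrate the "snapshot the sequence number, speculate, then
revalidate" idea, here is a minimal user-space sketch. It is not kernel
code: struct toy_vma, struct toy_mm, toy_invalidate() and
toy_lookup_optimistic() are all hypothetical names made up for this
example, and the "race" callback only simulates a concurrent vma change
in a single thread. A real version would sit inside an RCU read-side
section and would need the proper memory ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's real structures. */
struct toy_vma {
    unsigned long start, end;      /* covers [start, end) */
};

struct toy_mm {
    atomic_uint seqnum;            /* bumped whenever the vma layout changes */
    struct toy_vma *cached;        /* one-entry stand-in for the vma cache */
};

/* Writer side: any change to the vma layout bumps the sequence number. */
static void toy_invalidate(struct toy_mm *mm)
{
    atomic_fetch_add(&mm->seqnum, 1);
}

/*
 * Optimistic lookup: snapshot the sequence number, do the speculative
 * work, and only report a result if the number is unchanged afterwards.
 * 'race' (may be NULL) simulates a concurrent writer running between
 * the speculative work and the revalidation.
 */
static bool toy_lookup_optimistic(struct toy_mm *mm, unsigned long addr,
                                  unsigned long *offset_out,
                                  void (*race)(struct toy_mm *))
{
    assert(offset_out != NULL);

    unsigned int seq = atomic_load(&mm->seqnum);
    struct toy_vma *vma = mm->cached;

    if (!vma || addr < vma->start || addr >= vma->end)
        return false;              /* miss: caller takes the locked path */

    /* Speculative work; the field values may be stale at this point. */
    unsigned long off = addr - vma->start;

    if (race)
        race(mm);

    /* Revalidate before the result is actually used. */
    if (atomic_load(&mm->seqnum) != seq)
        return false;              /* layout changed under us: discard */

    *offset_out = off;
    return true;
}
```

The point of the sketch is only the ordering: nothing computed from the
vma is handed back unless the second sequence number check passes.
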
So I *think* we could do at least that limited "we hit in the vma
cache, and it's a nice normal simple vma with regular vma ops" with
just an RCU read lock, and skip the mmap_sem entirely. Of course, we'd
have to fall back on the mmap_sem if anything fails (not in the vma
cache, or the sequence number changes before we can actually insert
the result in the page tables etc).
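
The fast-path/fallback shape described above could look roughly like
the following user-space sketch. Everything here is an assumption made
for illustration: the sketch_* names are hypothetical, the "simple"
flag stands in for "nice normal vma with regular vma ops", and
sem_readers is just a counter that models holding mmap_sem for read,
not a real lock. The sequence number never changes in this
single-threaded toy, so the revalidation always passes here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's real structures. */
struct sketch_vma {
    unsigned long start, end;      /* covers [start, end) */
    bool simple;                   /* "regular vma ops", eligible for fast path */
};

struct sketch_mm {
    unsigned int seqnum;           /* would be bumped on vma layout changes */
    struct sketch_vma *cached;     /* one-entry stand-in for the vma cache */
    struct sketch_vma *vmas;       /* full vma list, walked only when locked */
    size_t nr_vmas;
    int sem_readers;               /* models mmap_sem held for read */
    unsigned long fast_hits, slow_hits;
};

/* Full lookup; only legal with the (modelled) mmap_sem held. */
static struct sketch_vma *sketch_find_locked(struct sketch_mm *mm,
                                             unsigned long addr)
{
    assert(mm->sem_readers > 0);
    for (size_t i = 0; i < mm->nr_vmas; i++)
        if (addr >= mm->vmas[i].start && addr < mm->vmas[i].end)
            return &mm->vmas[i];
    return NULL;
}

/* Lockless attempt: cache hit on a simple vma, sequence number unchanged. */
static bool sketch_fast_path(struct sketch_mm *mm, unsigned long addr)
{
    unsigned int seq = mm->seqnum;
    struct sketch_vma *vma = mm->cached;

    if (!vma || !vma->simple)
        return false;
    if (addr < vma->start || addr >= vma->end)
        return false;
    /* ... speculative page cache lookup would go here ... */
    return mm->seqnum == seq;      /* revalidate before using the result */
}

static bool sketch_handle_fault(struct sketch_mm *mm, unsigned long addr)
{
    if (sketch_fast_path(mm, addr)) {
        mm->fast_hits++;
        return true;
    }

    /* Anything failed: fall back to the locked path. */
    mm->sem_readers++;             /* stands in for taking mmap_sem for read */
    struct sketch_vma *vma = sketch_find_locked(mm, addr);
    if (vma) {
        mm->cached = vma;          /* refill the cache for next time */
        mm->slow_hits++;
    }
    mm->sem_readers--;             /* stands in for dropping mmap_sem */
    return vma != NULL;
}
```

The fallback is the existing behaviour; the fast path is purely an
optimization that is allowed to give up at any point.
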
The page cache is already RCU-safe, and the actual page table
operations are protected by another lock anyway (which should scale
better because it's a spinlock and held for shorter times, _and_ is
spread out by pte address).
Is it some trivial one-liner? No. But I suspect we could make a trial
"lockless page lookup for the simple cases that hit in the caches"
without a *lot* of effort.
Linus