Message-ID: <49C2818B.9060201@goop.org>
Date: Thu, 19 Mar 2009 10:31:55 -0700
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: Nick Piggin <nickpiggin@...oo.com.au>
CC: Avi Kivity <avi@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
Xen-devel <xen-devel@...ts.xensource.com>,
Jan Beulich <jbeulich@...ell.com>, Ingo Molnar <mingo@...e.hu>
Subject: Re: Question about x86/mm/gup.c's use of disabled interrupts
Nick Piggin wrote:
>> Also, assuming that disabling the interrupt is enough to get the
>> guarantees we need here, there's a Xen problem because we don't use IPIs
>> for cross-cpu tlb flushes (well, it happens within Xen). I'll have to
>> think a bit about how to deal with that, but I'm thinking that we could
>> add a per-cpu "tlb flushes blocked" flag, and maintain some kind of
>> per-cpu deferred tlb flush count so we can get around to doing the flush
>> eventually.
>>
>> But I want to make sure I understand the exact algorithm here.
>>
>
> FWIW, powerpc actually can flush tlbs without IPIs, and it also has
> a gup_fast. powerpc RCU frees its page _tables_ so we can walk them,
> and then I use speculative page references in order to be able to
> take a reference on the page without having it pinned.
>
Ah, interesting. So disabling interrupts prevents the RCU free from
happening, and non-atomic pte fetching is a non-issue. So it doesn't
address the PAE side of the problem.
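For illustration, here is a minimal user-space sketch of the PAE
concern: a 64-bit pte read as two 32-bit halves, with a re-check of the
low word to detect a torn read. The struct and function names are
hypothetical, not taken from gup.c. The sketch also assumes the writer
updates the halves in an order that makes a changed low word visible to
the reader, and that no TLB flush can complete during the read, which is
the guarantee the thread is discussing.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a PAE pte: two separately loaded halves. */
struct pae_pte {
    _Atomic uint32_t low;   /* holds the present bit on real hardware */
    _Atomic uint32_t high;
};

/*
 * Read both halves, then re-read the low half. If the low half changed
 * in between, the pair may be inconsistent, so try again.
 */
static uint64_t pae_pte_read(struct pae_pte *ptep)
{
    uint32_t low, high;

    do {
        low = atomic_load_explicit(&ptep->low, memory_order_acquire);
        high = atomic_load_explicit(&ptep->high, memory_order_acquire);
    } while (low != atomic_load_explicit(&ptep->low, memory_order_relaxed));

    return ((uint64_t)high << 32) | low;
}
```

The re-check only helps if the low word cannot go from one value back to
the same value while the reader is between loads; without a guarantee
like the blocked flush, the loop alone is not sufficient.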
> Turning gup_get_pte into a pvop would be a bit nasty because on !PAE
> it is just a single load, and even on PAE it is pretty cheap.
>
Well, it wouldn't be too bad; for !PAE it would turn into something we
could inline, so there'd be little to no cost. For PAE it would be out
of line, but a direct function call, which would be nicely cached and
very predictable once we've gone through the loop once (and for Xen
I think I'd just make it a cmpxchg8b-based implementation, assuming that
the tlb flush hypercall would offset the cost of making gup_fast a bit
slower).
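As a rough sketch of what a cmpxchg8b-style read could look like, the
user-space C11 code below performs an atomic 64-bit load by issuing a
compare-and-swap whose new value equals its expected value, so memory
is never changed. The function name is hypothetical, and this is not the
kernel's implementation; on 32-bit x86 a compiler would typically lower
this to a locked cmpxchg8b, but that is an assumption about the
toolchain.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically read a 64-bit value using compare-and-swap.
 *
 * expected == desired, so a successful exchange stores the value that
 * was already there. On failure, C11 writes the current contents into
 * 'expected'. Either way 'expected' ends up holding a consistent
 * 64-bit snapshot.
 */
static uint64_t pte_read_cmpxchg(_Atomic uint64_t *ptep)
{
    uint64_t expected = 0;

    atomic_compare_exchange_strong(ptep, &expected, expected);
    return expected;
}
```

One trade-off to keep in mind: a locked compare-and-swap takes the cache
line exclusively and needs the page to be writable, so it costs more
than the two plain loads it replaces.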
But it would be better if we can address it at a higher level.
J