[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0801301129260.30568@schroedinger.engr.sgi.com>
Date: Wed, 30 Jan 2008 11:35:06 -0800 (PST)
From: Christoph Lameter <clameter@....com>
To: Robin Holt <holt@....com>
cc: Andrea Arcangeli <andrea@...ranet.com>,
Avi Kivity <avi@...ranet.com>, Izik Eidus <izike@...ranet.com>,
Nick Piggin <npiggin@...e.de>, kvm-devel@...ts.sourceforge.net,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>, steiner@....com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
daniel.blueman@...drics.com, Hugh Dickins <hugh@...itas.com>
Subject: Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges
On Wed, 30 Jan 2008, Robin Holt wrote:
> I think I need to straighten this discussion out in my head a little bit.
> Am I correct in assuming Andrea's original patch set did not have any SMP
> race conditions for KVM? If so, then we need to start looking at how to
> implement Christoph's and my changes in a safe fashion. Andrea, I agree
> complete that our introduction of the range callouts have introduced
> SMP races.
The original patch drew the clearing of the sptes into ptep_clear_flush().
So the invalidate_page was called for each page regardless if we had been
doing an invalidate range before or not. It seems that the the
invalidate_range() was just there for optimization.
> The three issues we need to simultaneously solve is revoking the remote
> page table/tlb information while still in a sleepable context and not
> having the remote faulters become out of sync with the granting process.
> Currently, I don't see a way to do that cleanly with a single callout.
You could use the invalidate_page callouts to set a flag that no
additional rmap entries may be added until the invalidate_range has
occurred? We could add back all the original invalidate_pages() and pass
a flag that specifies that an invalidate range will follow. The notifier
can then decide what to do with that information. If its okay to defer
then do nothing and wait for the range_invalidate. XPmem could stop
allowing external references to be established until the invalidate_range
was successful.
Jack had a concern that multiple callouts for the same pte could cause
problems.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists