[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070511010842.GJ31925@holomorphy.com>
Date: Thu, 10 May 2007 18:08:42 -0700
From: William Lee Irwin III <wli@...omorphy.com>
To: Hugh Dickins <hugh@...itas.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andi Kleen <ak@...e.de>, Christoph Lameter <clameter@....com>,
linux-kernel@...r.kernel.org
Subject: Re: slub-i386-support.patch
On Thu, May 10, 2007 at 05:07:02PM -0700, William Lee Irwin III wrote:
> quicklist_free() with unflushed TLB entries admits speculation through
> the pagetable entries corresponding to the list links. So tlb_finish_mmu()
> is the place to call quicklist_free() on pagetables. This requires
> distinguishing preconstructed pagetables from freed user pages, which
> is not done in include/asm-generic/tlb.h (and core callers may need
> to be adjusted, pending the results of audits).
> To clarify, upper levels of pagetables are indeed cached by x86 TLB's.
> The same kind of deferral of freeing until the TLB is flushed required
> for leaf pagetables is required for the upper levels as well.
Looking more closely at it, the entire attempt to avoid struct page
pointers is far beyond pointless. The freeing functions unconditionally
require struct page pointers to either be passed or computed and the
allocation function's virtual address it returns as a result is not
directly usable. The callers all have to do arithmetic on the result.
One might as well stash precomputed pfn's (if not paddrs) and vaddrs in
page->private and page->mapping, chain them with ->lru (use only .next
if you care to stay singly-linked), and handle struct page pointers
throughout. At that point quicklists not only become directly callable
for pagetable freeing (including upper levels) instead of needing calls
to quicklist freeing staged to occur at the time of tlb_finish_mmu(),
but also become usable for the highpte case.
The computations this is trying to save on are computing the virtual
and physical addresses (pfn's modulo a cheap shift; besides, all the
API's work on pfn's) of a page from the pointer to the struct page.
Chaining through the memory for the page incurs the cost of having to
stage freeing through tlb_finish_mmu() instead of using the quicklist
as a staging arena directly. So the translation from a struct page
pointer is not saving work. It's not saving cache, either. The page's
memory is no more likely to be hot than its struct page.
In the course of freeing the pointer to the struct page is computed
whether by the caller or the API function. So the translation to a
struct page pointer is done during freeing regardless.
A better solution would be to precompute those results and store
them in various fields of the struct page. i386 can move to using
generation numbers (->_mapcount and ->index are still available
for 64 bits there even after quicklists use ->lru, ->mapping, and
->private, and quicklists really only need half of ->lru) to handle
change_page_attr() and vmalloc_sync().
-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists