Message-ID: <20081123091843.GK30453@elte.hu>
Date: Sun, 23 Nov 2008 10:18:44 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Ying Han <yinghan@...gle.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
akpm <akpm@...ux-foundation.org>,
Mike Waychison <mikew@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Rohit Seth <rohitseth@...gle.com>,
Hugh Dickins <hugh@...itas.com>, Nick Piggin <npiggin@...e.de>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY
* Ying Han <yinghan@...gle.com> wrote:
> page fault retry with NOPAGE_RETRY
Interesting patch.
> Allow major faults to drop the mmap_sem read lock while waiting for a
> synchronous disk read. This allows another thread which wishes to grab
> down_read(mmap_sem) to proceed while the current one is waiting for the
> disk IO.
Do you mean down_write()? down_read() can already be nested
arbitrarily.
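To make the reader/writer distinction concrete, here is a toy, single-threaded model of a reader/writer lock. It is only a sketch: the toy_* names are made up for illustration, it is not the kernel's rw_semaphore, and it ignores blocking and writer queueing entirely.

```c
#include <stdbool.h>

/* Toy model of a reader/writer lock: a reader count plus a writer flag. */
struct toy_rwsem {
	int readers;
	bool writer;
};

/* Non-blocking analogue of down_read(): fails only if a writer holds the lock. */
bool toy_try_read(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

/* Non-blocking analogue of down_write(): needs the lock completely free. */
bool toy_try_write(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

/* Analogue of up_read(): drop one reader reference. */
void toy_release_read(struct toy_rwsem *sem)
{
	sem->readers--;
}
```

In this model a second toy_try_read() succeeds while the first reader is still inside, but toy_try_write() fails until every reader has released. So a reader that sleeps on disk IO with the lock held would delay writers, not other readers.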
> The patch sets PF_FAULT_MAYRETRY in current->flags to indicate that
> the caller can tolerate the retry in the filemap_fault call path.
>
> The benchmark mmaps a huge file and spawns 64 threads, each
> faulting in pages in reverse order; the result shows an 8%
> performance hit with the patch.
I suspect we also want to see the cases where this change helps?
Also, constructs like this are pretty ugly:
> +#ifdef CONFIG_X86_64
> +asmlinkage
> +#endif
> +void do_page_fault(struct pt_regs *regs, unsigned long error_code)
> +{
> +	current->flags |= PF_FAULT_MAYRETRY;
> +	__do_page_fault(regs, error_code);
> +	current->flags &= ~PF_FAULT_MAYRETRY;
> +}
This seems to be unnecessary runtime overhead to pass in a flag to
handle_mm_fault(). Why not extend the 'write' flag of
handle_mm_fault() to also signal "arch is able to retry"?
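A rough userspace sketch of what passing the information as an argument could look like. All SKETCH_* and sketch_* names are hypothetical and not taken from the patch or from the kernel; the point is only that one bitmask parameter can carry both "this was a write" and "the caller can retry", with no per-task state to set and clear.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not from the patch. */
#define SKETCH_FAULT_WRITE	0x01u	/* the faulting access was a write */
#define SKETCH_FAULT_MAYRETRY	0x02u	/* the caller is able to restart the fault */

enum sketch_fault_result {
	SKETCH_FAULT_DONE,	/* fault handled */
	SKETCH_FAULT_RETRY,	/* caller should restart the fault from scratch */
	SKETCH_FAULT_WAITED	/* caller could not retry, so we waited in place */
};

/*
 * Stand-in for a fault handler that receives its options as a flags
 * argument. 'page_locked' simulates "the page is under IO right now".
 */
enum sketch_fault_result sketch_handle_fault(unsigned int flags, bool page_locked)
{
	if (!page_locked)
		return SKETCH_FAULT_DONE;

	if (flags & SKETCH_FAULT_MAYRETRY)
		return SKETCH_FAULT_RETRY;

	return SKETCH_FAULT_WAITED;
}
```

An arch that can retry would call sketch_handle_fault(SKETCH_FAULT_WRITE | SKETCH_FAULT_MAYRETRY, ...), while one that cannot simply leaves the second bit clear and keeps today's behaviour.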
Also, _if_ we decide that from-scratch pagefault retries are good, I
see no reason why this should not be extended to all architectures:
The retry should happen purely in the MM layer - all information is
available already, and much of do_page_fault() could generally be
moved into mm/memory.c, with one or two arch-provided standard
callbacks to express certain page fault quirks. (such as vm86 mode on
x86)
(Such a design would allow more nice cleanups - handle_mm_fault()
could inline inside the pagefault handler, etc.)
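To illustrate the shape of such a design, here is a small userspace mock. Everything in it (struct sketch_fault, struct sketch_arch_ops, the callback name, the "address 0" quirk) is invented for the example and only approximates the idea: a generic fault routine owns the retry loop, and the architecture contributes a callback for its quirks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one fault, for this sketch only. */
struct sketch_fault {
	unsigned long address;
	bool write;
	int io_pending;		/* number of attempts that still hit disk IO */
	int attempts;		/* how often the generic path was entered */
};

/* Hypothetical arch hooks: return true if the arch fully handled the fault. */
struct sketch_arch_ops {
	bool (*fixup_quirk)(struct sketch_fault *fault);
};

enum { SKETCH_OK, SKETCH_RETRY };

/* One pass of the generic handler; asks for a retry while IO is pending. */
static int sketch_mm_fault_once(struct sketch_fault *fault)
{
	fault->attempts++;
	if (fault->io_pending > 0) {
		/* Pretend the lock was dropped and the IO was waited for. */
		fault->io_pending--;
		return SKETCH_RETRY;
	}
	return SKETCH_OK;
}

/* Generic entry point: arch quirks first, then the retry loop. */
int sketch_generic_page_fault(struct sketch_fault *fault,
			      const struct sketch_arch_ops *ops)
{
	if (ops && ops->fixup_quirk && ops->fixup_quirk(fault))
		return SKETCH_OK;

	while (sketch_mm_fault_once(fault) == SKETCH_RETRY)
		;	/* a real design would re-take the lock and redo the lookup */

	return SKETCH_OK;
}

/* Example arch callback: claims faults at address 0 (purely illustrative). */
bool sketch_example_quirk(struct sketch_fault *fault)
{
	return fault->address == 0;
}
```

With this split, the retry policy lives in one place and an arch that has no quirks can pass NULL for the ops.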
Also, a few small details. Please use this proper multi-line comment
style:
> + /*
> + * Page is already locked by someone else.
> + *
> + * We don't want to be holding down_read(mmap_sem)
> + * inside lock_page(). We use wait_on_page_lock here
> + * to just wait until the page is unlocked, but we
> + * don't really need to lock it.
> + */
Not this one:
> + /* page may be available, but we have to restart the process
> + * because mmap_sem was dropped during the ->fault */
Ingo