Message-ID: <492EEF0C.9040607@google.com>
Date: Thu, 27 Nov 2008 11:03:40 -0800
From: Mike Waychison <mikew@...gle.com>
To: Nick Piggin <npiggin@...e.de>
CC: Ying Han <yinghan@...gle.com>, Ingo Molnar <mingo@...e.hu>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
akpm <akpm@...ux-foundation.org>,
David Rientjes <rientjes@...gle.com>,
Rohit Seth <rohitseth@...gle.com>,
Hugh Dickins <hugh@...itas.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"H. Peter Anvin" <hpa@...or.com>, edwintorok@...il.com
Subject: Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY
Nick Piggin wrote:
> On Thu, Nov 27, 2008 at 01:28:41AM -0800, Mike Waychison wrote:
>>> Hmm. How quantifiable is the benefit? Does it actually matter that you
>>> can read the proc file much faster? (this is for some automated workload
>>> management daemon or something, right?)
>> Correct. I don't recall the numbers from the pathological cases we were
>> seeing, but iirc, it was on the order of 10s of seconds, likely
>> exacerbated by slower than usual disks. I've been digging through my
>> inbox to find numbers without much success -- we've been using a variant
>> of this patch since 2.6.11.
>>
>> Török however identified mmap taking on the order of several
>> milliseconds due to this exact problem:
>>
>> http://lkml.org/lkml/2008/9/12/185
>
> Turns out to be a different problem.
>
What do you mean?
>
>>> Would it be possible to reduce mmap()/munmap() activity? eg. if it is
>>> due to a heap memory allocator, then perhaps do more batching or set
>>> some hysteresis.
>> I know our tcmalloc team had made great strides to reduce mmap_sem
>> contention for the heap, but there are various other bits of the stack
>> that really want to mmap files..
>>
>> We generally try to avoid such things, but sometimes it a) can't be
>> easily avoided (third party libraries for instance) and b) when it hits
>> us, it affects the overall health of the machine/cluster (the monitoring
>> daemons get blocked, which isn't very healthy).
>
> Are you doing appropriate posix_fadvise to prefetch in the files before
> faulting, and madvise hints if appropriate?
>
Yes, we've been slowly rolling out fadvise hints, though not to
prefetch, and definitely not for faulting. I don't see how issuing a
prefetch right before we try to fault in a page is going to help
matters. The pages may appear in pagecache, but they won't be uptodate
by the time we look at them anyway, so we're back to square one.
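
For concreteness, the pattern being asked about looks roughly like the sketch below. This is only an illustration, not code from our tree: map_with_prefetch_hint() is a hypothetical helper name, and it hints the whole file rather than a chosen range. Both calls are advisory and return without waiting for the reads to complete, which is why a fault taken immediately afterwards can still end up waiting on I/O.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original thread): ask for readahead
 * on a whole file, then map it read-only.
 * Returns the mapping and stores its length in *len_out,
 * or returns MAP_FAILED on any error (including an empty file).
 */
void *map_with_prefetch_hint(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return MAP_FAILED;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return MAP_FAILED;
    }

    /* Advisory only: the return value is ignored because a failed
     * hint does not prevent the mapping from working. */
    (void)posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);

    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return MAP_FAILED;

    /* Same kind of hint, expressed against the mapping itself. */
    (void)posix_madvise(p, (size_t)st.st_size, POSIX_MADV_WILLNEED);

    *len_out = (size_t)st.st_size;
    return p;
}
```

The hints only pay off if they are issued well ahead of the first access, so a caller would want to do other work between this call and touching the mapping.
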
The best use for fadvise we've found is FADV_DONTNEED as it kicks off
any IO for dirty pages asynchronously (except it misses metadata..).
That it drops clean pages is a nice side-benefit. With it, we don't
have to rely on the kernel's heuristics for writeout which lead to
imbalances and latency spikes.
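
A minimal sketch of that usage, assuming a plain write()-then-hint pattern; write_then_drop() is a hypothetical name used only for illustration, and real callers would likely hint a specific range rather than the whole file. Note that posix_fadvise() returns an error number directly instead of setting errno, and that it is a hint: nothing here guarantees when the pages are written or dropped.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original thread): write len bytes
 * from buf to fd, then tell the kernel the file's cached pages are no
 * longer needed.
 * Returns 0 on success, -1 if write() fails, or the positive error
 * number returned by posix_fadvise().
 */
int write_then_drop(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    /* write() may be partial, so loop until everything is submitted. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }

    /* offset 0 with len 0 means "from the start to the end of the file". */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

If the data must actually be on disk before continuing, this needs an fdatasync() or fsync() as well; the hint alone makes no durability promise.
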