linux-kernel - Re: [RFC v1][PATCH]page_fault retry with NOPAGE

Message-ID: <493074D8.3080002@google.com>
Date:	Fri, 28 Nov 2008 14:46:48 -0800
From:	Mike Waychison <mikew@...gle.com>
To:	Nick Piggin <npiggin@...e.de>
CC:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ying Han <yinghan@...gle.com>, Ingo Molnar <mingo@...e.hu>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	akpm <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	Rohit Seth <rohitseth@...gle.com>,
	Hugh Dickins <hugh@...itas.com>,
	"H. Peter Anvin" <hpa@...or.com>, edwintorok@...il.com
Subject: Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY

Nick Piggin wrote:
> On Thu, Nov 27, 2008 at 11:22:57AM -0800, Mike Waychison wrote:
>> Nick Piggin wrote:
>>> On Thu, Nov 27, 2008 at 11:00:07AM +0100, Peter Zijlstra wrote:
>>> pagemap_read looks like it can use get_user_pages_fast. The smaps and
>>> clear_refs stuff might have been nicer if they could work on ranges
>>> like pagemap. Then they could avoid mmap_sem as well (although maps
>>> would need to be sampled and take mmap_sem I guess).
>>>
>>> One problem with dropping mmap_sem is that it hurts priority/fairness.
>>> And it opens a bit of a (maybe theoretical but not something to completely
>>> ignore) forward progress hole AFAIKS. If mmap_sem is very heavily
>>> contended, then the refault is going to take a while to get through,
>>> and then the page might get reclaimed, etc.
>> Right, this can be an issue.  The way around it should be to minimize 
>> the length of time any single lock holder can sit on it.  Compared to 
>> what we have today with:
>>
>>   - sleep in major fault with read lock held,
>>   - enqueue writer behind it,
>>   - and make all other faults wait on the rwsem
>>
>> The retry logic seems to be a lot better for forward progress.
> 
> The whole reason why you have the latency is because it is
> guaranteeing forward progress for everyone. The retry logic
> may work out better in that situation, but it does actually
> open a starvation hole.
> 

Right.  In practice, though, we haven't seen this cause a problem, and the
starvation hole is why we only allow the path to retry once.
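
As an illustration only (this is not code from the RFC patch), the
retry-once rule can be sketched as a small user-space simulation in C.
Every name here (sim_mm, handle_fault, do_fault, FAULT_RETRY) is
hypothetical, and a pair of flags stands in for mmap_sem and the page
cache. The first attempt may drop the lock while it waits for I/O; the
second attempt is not allowed to retry, so it waits with the lock held
and always completes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes, loosely modelled on the idea of NOPAGE_RETRY. */
enum fault_result { FAULT_OK, FAULT_RETRY };

struct sim_mm {
    bool lock_held;    /* stands in for mmap_sem held for read */
    bool page_cached;  /* true once the page has been read in */
    int  lock_drops;   /* how often the lock was dropped to wait for I/O */
};

/* One attempt at servicing the fault; the caller must hold the lock. */
static enum fault_result handle_fault(struct sim_mm *mm, bool allow_retry)
{
    assert(mm->lock_held);
    if (mm->page_cached)
        return FAULT_OK;            /* minor fault: nothing to wait for */
    if (allow_retry) {
        mm->lock_held = false;      /* drop the lock before "sleeping" */
        mm->lock_drops++;
        mm->page_cached = true;     /* simulated I/O completes */
        return FAULT_RETRY;
    }
    /* Retry not allowed: wait for I/O with the lock held, as today. */
    mm->page_cached = true;
    return FAULT_OK;
}

/*
 * Full fault path. Returns the number of attempts made. If
 * reclaim_between is true, the page is evicted again before the second
 * attempt, which is the starvation case discussed in this thread.
 */
static int do_fault(struct sim_mm *mm, bool reclaim_between)
{
    int attempts = 0;
    bool allow_retry = true;

    for (;;) {
        mm->lock_held = true;       /* take the lock for this attempt */
        attempts++;
        if (handle_fault(mm, allow_retry) == FAULT_OK) {
            mm->lock_held = false;
            return attempts;
        }
        /* handle_fault() already dropped the lock for us. */
        if (reclaim_between)
            mm->page_cached = false;
        allow_retry = false;        /* only one retry is permitted */
    }
}
```

Even when the page is reclaimed between the two attempts, do_fault()
finishes in at most two passes, because the second pass falls back to
waiting with the lock held. The cost in that case is a second read of
the page.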

Do you have any suggestions on how we could plug this hole?   Perhaps we 
could pin a reference to the page in the vm_fault structure across the 
retry?
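
To make the pinning idea concrete, here is a minimal user-space sketch
in C. It is an assumption about how such a scheme might look, not a
description of the real vm_fault structure: sim_page, sim_fault,
try_reclaim and the two fault passes are all hypothetical names. The
first pass takes an extra reference on the page and stores it in the
fault descriptor; reclaim skips any page with a non-zero reference
count, so the retry pass finds the page still resident.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_page {
    int  refcount;   /* extra references; reclaim skips pinned pages */
    bool resident;   /* true while the page contents are in memory */
};

/* Hypothetical stand-in for the fault descriptor carried across the retry. */
struct sim_fault {
    struct sim_page *pinned;  /* page held across the retry, or NULL */
};

/* Simulated reclaim: evicts the page only if nobody holds a reference. */
static bool try_reclaim(struct sim_page *p)
{
    if (p->refcount > 0)
        return false;
    p->resident = false;
    return true;
}

/* First pass: the read completes, and the page is pinned before retrying. */
static void fault_first_pass(struct sim_fault *vmf, struct sim_page *p)
{
    p->resident = true;
    p->refcount++;
    vmf->pinned = p;
}

/*
 * Retry pass: reports whether the pinned page is still resident (so no
 * second read is needed), then drops the pin.
 */
static bool fault_retry_pass(struct sim_fault *vmf)
{
    struct sim_page *p = vmf->pinned;
    bool still_resident;

    assert(p != NULL);
    still_resident = p->resident;
    p->refcount--;
    vmf->pinned = NULL;
    return still_resident;
}
```

In this model a reclaim attempt between the two passes fails, and the
page becomes reclaimable again once the retry pass has dropped its
reference. A real implementation would also have to release the pin on
every error path.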