Message-ID: <516CC675.8020903@linaro.org>
Date:	Mon, 15 Apr 2013 20:33:09 -0700
From:	John Stultz <john.stultz@...aro.org>
To:	Minchan Kim <minchan.kernel.2@...il.com>
CC:	KOSAKI Motohiro <kosaki.motohiro@...il.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Michael Kerrisk <mtk.manpages@...il.com>,
	Arun Sharma <asharma@...com>, Mel Gorman <mel@....ul.ie>,
	Hugh Dickins <hughd@...gle.com>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Rik van Riel <riel@...hat.com>, Neil Brown <neilb@...e.de>,
	Mike Hommey <mh@...ndium.org>, Taras Glek <tglek@...illa.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Jason Evans <je@...com>, sanjay@...gle.com,
	Paul Turner <pjt@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Michel Lespinasse <walken@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC v7 00/11] Support vrange for anonymous page
On 04/14/2013 12:42 AM, Minchan Kim wrote:
> Hi KOSAKI,
>
> On Thu, Apr 11, 2013 at 11:01:11AM -0400, KOSAKI Motohiro wrote:
>>>>>> and adding new syscall invocation is unwelcome.
>>>>> Sure. But one more system call could be cheaper than page-granularity
>>>>> operation on purged range.
>>>> I don't think vrange(VOLATILE) cost is related to this discussion.
>>>> Whether sending SIGBUS or just nuke pte, purge should be done on vmscan,
>>>> not vrange() syscall.
>>> Again, please see the MADV_FREE. http://lwn.net/Articles/230799/
>>> It changes pte and page flags on all pages of the range through
>>> zap_pte_range. So it would make vrange(VOLATILE) expensive, and
>>> the bigger the range is, the bigger the cost is.
>> This hadn't crossed my mind. Now try_to_discard_one() inserts a vrange
>> to make SIGBUS; then we can insert pte_none() at the same cost too. Am
>> I missing something?
> For your requirement, we need some tracking model to detect whether a page
> is currently in use by the process before the VM discards it *if* we don't
> provide a paired vrange(NOVOLATILE) system call (look below). So the
> tracking model should be formed in vrange(VOLATILE) system call context.
To further clarify Minchan's note here: the reason it's important for the
application to use vrange(NOVOLATILE) is really to help define _when
the range stops being volatile_.

In your libc hack to use vrange(), you see the benefit of not immediately
purging the memory as you do with MADV_DONTNEED. However, if the heap
grows again and those addresses are re-used, nothing has stopped those
pages from continuing to be volatile. Thus the kernel could decide to
purge those pages after they start to be used again, and you'd lose
data. I suspect that's not what you want. :)

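To make the hazard concrete, here is a toy user-space model. It is only
a sketch: it does not call the proposed syscall, and every toy_* name
is made up for illustration. It assumes, as described above, that a
write alone does not clear the volatile state.

```c
#include <assert.h>
#include <string.h>

#define TOY_RANGE_SIZE 16

/* Hypothetical model of one address range; not a kernel structure. */
struct toy_range {
    unsigned char data[TOY_RANGE_SIZE];
    int is_volatile;
};

/* Models vrange(VOLATILE): contents may be discarded from now on. */
void toy_mark_volatile(struct toy_range *r)
{
    assert(r != NULL);
    r->is_volatile = 1;
}

/* Models vrange(NOVOLATILE): contents must be preserved from now on. */
void toy_mark_nonvolatile(struct toy_range *r)
{
    assert(r != NULL);
    r->is_volatile = 0;
}

/* Models memory pressure: purge only volatile ranges. Returns 1 if purged. */
int toy_reclaim(struct toy_range *r)
{
    assert(r != NULL);
    if (!r->is_volatile)
        return 0;
    memset(r->data, 0, sizeof r->data);
    return 1;
}

/* free(), then heap reuse with no NOVOLATILE call. Returns 1 if data survived. */
int toy_reuse_without_novolatile(void)
{
    struct toy_range r = { .is_volatile = 0 };

    toy_mark_volatile(&r);               /* free(): range released lazily */
    memset(r.data, 0x42, sizeof r.data); /* heap grows, range is reused */
    toy_reclaim(&r);                     /* range is still volatile: purged */
    return r.data[0] == 0x42;
}

/* Same sequence, but NOVOLATILE is issued before the range is reused. */
int toy_reuse_with_novolatile(void)
{
    struct toy_range r = { .is_volatile = 0 };

    toy_mark_volatile(&r);
    toy_mark_nonvolatile(&r);            /* heap grows: stop volatility first */
    memset(r.data, 0x42, sizeof r.data);
    toy_reclaim(&r);                     /* nothing to purge */
    return r.data[0] == 0x42;
}
```

In this model the first sequence loses the newly written data and the
second keeps it, which is the whole reason for the paired call.
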
Rik's MADV_FREE implementation is very similar to vrange(VOLATILE), but
has an implicit vrange(NOVOLATILE) on any page write. So dirtying a
page stops the kernel from later purging it.

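As a rough illustration of that write-cancels-the-free behaviour, the
sketch below uses madvise() with MADV_FREE where the system headers
define it. This assumes a kernel that exposes MADV_FREE through
madvise(); the helper name madv_free_then_write() is made up here, and
the code returns -1 rather than guessing when the flag is unavailable.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if data written after MADV_FREE reads back intact,
 * 0 if it does not, and -1 if the mapping or MADV_FREE is unavailable.
 */
int madv_free_then_write(void)
{
#ifndef MADV_FREE
    return -1;                      /* flag not provided by these headers */
#else
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t len = (size_t)page * 4;
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 0xAA, len);         /* old contents we no longer need */

    if (madvise(buf, len, MADV_FREE) != 0) {
        munmap(buf, len);
        return -1;                  /* running kernel rejected the advice */
    }

    /* Dirtying the pages again is the implicit "stop being volatile". */
    memset(buf, 0x55, len);

    int ok = 1;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0x55) {
            ok = 0;
            break;
        }
    }

    munmap(buf, len);
    return ok;
#endif
}
```

Note that a read between the madvise() call and the second memset()
could see either the old bytes or zeroes, so the sketch never relies on
what is there before the rewrite.
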
This MADV_FREE semantic works very well if you always want zerofill (as
in the case of malloc/free). But for other data, it's important to know
that something was lost (as a zero page could be valid data), and that's
why we provide the SIGBUS, as well as the purged notification on
vrange(NOVOLATILE).

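The zero-page ambiguity can be sketched with another toy model. Again
this is only an illustration: the toy_cache_* names are hypothetical,
the SIGBUS path is not modelled, and the return value of
toy_cache_mark_nonvolatile() merely stands in for the purged
notification.

```c
#include <assert.h>
#include <string.h>

#define TOY_CACHE_SIZE 8

/* Hypothetical cache object whose valid contents may be all zeroes. */
struct toy_cache {
    unsigned char data[TOY_CACHE_SIZE];
    int is_volatile;
    int purged;
};

/* Models vrange(VOLATILE). */
void toy_cache_mark_volatile(struct toy_cache *c)
{
    assert(c != NULL);
    c->is_volatile = 1;
    c->purged = 0;
}

/* Models reclaim under memory pressure: zero-fill and remember the purge. */
void toy_cache_pressure(struct toy_cache *c)
{
    assert(c != NULL);
    if (c->is_volatile) {
        memset(c->data, 0, sizeof c->data);
        c->purged = 1;
    }
}

/* Models vrange(NOVOLATILE): returns 1 if the contents were purged. */
int toy_cache_mark_nonvolatile(struct toy_cache *c)
{
    assert(c != NULL);
    int was_purged = c->purged;
    c->is_volatile = 0;
    c->purged = 0;
    return was_purged;
}

/* Runs one volatile/nonvolatile cycle on a cache holding valid zeroes. */
static int toy_cache_cycle(int pressure, int *all_zero)
{
    struct toy_cache c = { .is_volatile = 0 };  /* valid data: all zeroes */

    toy_cache_mark_volatile(&c);
    if (pressure)
        toy_cache_pressure(&c);
    int lost = toy_cache_mark_nonvolatile(&c);

    *all_zero = 1;
    for (size_t i = 0; i < sizeof c.data; i++)
        if (c.data[i] != 0)
            *all_zero = 0;
    return lost;
}

/* Returns the purged notification for the cycle. */
int toy_cache_lost_after(int pressure)
{
    int all_zero;
    return toy_cache_cycle(pressure, &all_zero);
}

/* Returns 1 if the buffer reads as all zeroes after the cycle. */
int toy_cache_zero_after(int pressure)
{
    int all_zero;
    (void)toy_cache_cycle(pressure, &all_zero);
    return all_zero;
}
```

With or without pressure the buffer reads back as zeroes, so only the
returned flag tells the application whether it has to regenerate the
data.
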
In other words, as long as you do a vrange(NOVOLATILE) when you grow the
heap again (before it's used), it should be very similar to the MADV_FREE
behavior, but is more flexible for other use cases.

thanks
-john