linux-kernel - Re: missing madvise functionality

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <46145476.8080504@yahoo.com.au>
Date:	Thu, 05 Apr 2007 11:44:22 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Hugh Dickins <hugh@...itas.com>
CC:	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jakub Jelinek <jakub@...hat.com>,
	Ulrich Drepper <drepper@...hat.com>,
	Andi Kleen <andi@...stfloor.org>,
	Linux Kernel <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: missing madvise functionality

Hugh Dickins wrote:
> On Wed, 4 Apr 2007, Rik van Riel wrote:
> 
>>Hugh Dickins wrote:
>>
>>
>>>(I didn't understand how Rik would achieve his point 5, _no_ lock
>>>contention while repeatedly re-marking these pages, but never mind.)
>>
>>The CPU marks them accessed&dirty when they are reused.
>>
>>The VM only moves the reused pages back to the active list
>>on memory pressure.  This means that when the system is
>>not under memory pressure, the same page can simply stay
>>PG_lazyfree for multiple malloc/free rounds.
> 
> 
> Sure, there's no need for repetitious locking at the LRU end of it;
> but you said "if the system has lots of free memory, pages can go
> through multiple free/malloc cycles while sitting on the dontneed
> list, very lazily with no lock contention".  I took that to mean,
> with userspace repeatedly madvising on the ranges they fall in,
> which will involve mmap_sem and ptl each time - just in order
> to check that no LRU movement is required each time.
> 
> (Of course, there's also the problem that we don't leave our
> systems with lots of free memory: some LRU balancing decisions.)

I don't agree this approach is the best one anyway. I'd rather
just the simple MADV_DONTNEED/MADV_DONEED.

Once you go through the trouble of protecting the memory and
flushing TLBs, unprotecting them afterwards and taking a trap
(even if it is a pure hardware trap), I doubt you've saved much.

You may have saved the cost of zeroing out the page, but that
has to be weighed against the fact that you have left a possibly
cache hot page sitting there to get cold, and your accesses to
initialise the malloced memory might have more cache misses.

If you just free the page, it goes onto a nice LIFO cache hot
list, and when you want to allocate another one, you'll probably
get a cache hot one.

The problem is down_write(mmap_sem) isn't it? We can and should
easily fix that problem now. If we subsequently want to look at
micro optimisations to avoid zeroing using MMU tricks, then we
have a good base to compare with.

-- 
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/