Date:	Wed, 4 Nov 2015 17:05:47 -0500
From:	Daniel Micay <danielmicay@...il.com>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Minchan Kim <minchan@...nel.org>, Hugh Dickins <hughd@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Michael Kerrisk <mtk.manpages@...il.com>,
	Michal Hocko <mhocko@...e.cz>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Linux API <linux-api@...r.kernel.org>, Jason Evans <je@...com>,
	Shaohua Li <shli@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	yalin wang <yalin.wang2010@...il.com>,
	Mel Gorman <mgorman@...e.de>
Subject: Re: [PATCH v2 01/13] mm: support madvise(MADV_FREE)

> With enough pages at once, though, munmap would be fine, too.

That implies lots of page faults and zeroing though. The zeroing alone
is a major performance issue.

There are separate issues with munmap since it ends up resulting in a
lot more virtual memory fragmentation. It would help if the kernel used
first-best-fit for mmap instead of the current naive algorithm (bonus:
O(log n) worst-case, not O(n)). Since allocators like jemalloc and
PartitionAlloc want 2M aligned spans, mixing them with other allocators
can also accelerate the VM fragmentation caused by the dumb mmap
algorithm (i.e. they make a 2M aligned mapping, some other mmap user
does 4k, now there's a nearly 2M gap when the next 2M region is made and
the kernel keeps going rather than reusing it). Anyway, that's a totally
separate issue from this. Just felt like complaining :).
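
To make the gap scenario above concrete, here is a toy model of it. It is
only a sketch: BumpDown and BestFit are hypothetical names, the top address
is arbitrary, and neither class is the kernel's actual mmap code. BumpDown
stands in for "keeps going rather than reusing it", and BestFit is a
simplified linear-scan stand-in for a gap-reusing policy (a real one would
want a tree to get the O(log n) bound).

```python
PAGE = 4 * 1024
SPAN = 2 * 1024 * 1024


def align_down(addr, align):
    return addr - (addr % align)


class BumpDown:
    """Hands out addresses top-down and never revisits gaps it skipped."""

    def __init__(self, top):
        self.next_top = top
        self.gaps = []  # (start, length) holes left behind by alignment

    def map(self, length, align=PAGE):
        start = align_down(self.next_top - length, align)
        skipped = self.next_top - (start + length)
        if skipped:
            self.gaps.append((start + length, skipped))
        self.next_top = start
        return start


class BestFit(BumpDown):
    """Same, but first tries the smallest recorded gap that fits."""

    def map(self, length, align=PAGE):
        candidates = []
        for i, (gap_start, gap_len) in enumerate(self.gaps):
            start = align_down(gap_start + gap_len - length, align)
            if start >= gap_start:
                candidates.append((gap_len, i, start))
        if not candidates:
            return super().map(length, align)
        _, i, start = min(candidates)
        gap_start, gap_len = self.gaps.pop(i)
        # Whatever is left below and above the new mapping stays a gap.
        if start > gap_start:
            self.gaps.append((gap_start, start - gap_start))
        tail = gap_start + gap_len - (start + length)
        if tail:
            self.gaps.append((start + length, tail))
        return start


def run(allocator_cls, top=1 << 40):
    vm = allocator_cls(top)
    vm.map(SPAN, align=SPAN)  # allocator's 2M aligned span
    vm.map(PAGE)              # some other mmap user does 4k
    vm.map(SPAN, align=SPAN)  # next span: alignment skips nearly 2M
    hole = sum(length for _, length in vm.gaps)
    for _ in range(16):       # later small mappings
        vm.map(PAGE)
    wasted = sum(length for _, length in vm.gaps)
    return hole, wasted, top - vm.next_top


if __name__ == "__main__":
    for cls in (BumpDown, BestFit):
        hole, wasted, footprint = run(cls)
        print(cls.__name__, hole, wasted, footprint)
```

In this model the hole after the second span is 2M minus 4k with either
policy. The difference is what happens next: BumpDown leaves the hole
untouched and extends the footprint for every later 4k mapping, while
BestFit places them inside the hole.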

> Maybe what's really needed is a MADV_FREE variant that takes an iovec.
> On an all-cores multithreaded mm, the TLB shootdown broadcast takes
> thousands of cycles on each core more or less regardless of how much
> of the TLB gets zapped.

That would work very well. The allocator ends up having a sequence of
dirty spans that it needs to purge in one go. As long as purging is
fairly spread out, the cost of a single TLB shootdown isn't that bad. It
is extremely bad if it needs to do it over and over to purge a bunch of
ranges, which can happen if the memory has ended up being very, very
fragmented despite the efforts to compact it (depends on what the
application ends up doing).
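
A rough cost model of the difference, as a sketch only. It assumes one
shootdown broadcast per purge call and a fixed per-core cost regardless of
range size, which is the simplification from the quoted text. The function
names, the core count, and the cycle figure are all made up for
illustration.

```python
def coalesce(spans):
    """Merge touching or overlapping (start, length) ranges."""
    merged = []
    for start, length in sorted(spans):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_len = merged[-1]
            new_len = max(prev_len, start + length - prev_start)
            merged[-1] = (prev_start, new_len)
        else:
            merged.append((start, length))
    return merged


def shootdown_cycles(spans, cores, cycles_per_shootdown, batched):
    """Total cycles spent on shootdowns, summed across all cores.

    Assumption: each purge call costs one broadcast. Unbatched means one
    call per coalesced range; batched means a single call for the lot.
    """
    calls = 1 if batched and spans else len(coalesce(spans))
    return calls * cores * cycles_per_shootdown


if __name__ == "__main__":
    # Hypothetical dirty spans: two adjacent pages, then two isolated runs.
    dirty = [(0, 4096), (4096, 4096), (16384, 8192), (1 << 20, 4096)]
    print(coalesce(dirty))
    print(shootdown_cycles(dirty, cores=8, cycles_per_shootdown=2000, batched=False))
    print(shootdown_cycles(dirty, cores=8, cycles_per_shootdown=2000, batched=True))
```

Coalescing adjacent spans in userspace only helps while the dirty memory is
contiguous. Once it is badly fragmented, the per-range count stays high and
only a batched call brings it back down to one.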

