Date:	Sun, 3 Nov 2013 18:57:14 -0500
From:	KOSAKI Motohiro <kosaki.motohiro@...il.com>
To:	Davidlohr Bueso <davidlohr@...com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Hugh Dickins <hughd@...gle.com>,
	Michel Lespinasse <walken@...gle.com>,
	Ingo Molnar <mingo@...nel.org>, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>,
	Guan Xuetao <gxt@...c.pku.edu.cn>, aswin@...com,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH] mm: cache largest vma

>> I'm slightly surprised this cache achieves a 15% hit-rate improvement.
>> Which applications benefit? You listed a lot of applications, but I'm
>> not sure which depend heavily on the largest vma.
>
> Well, I chose the largest vma because it gives us a greater chance that
> the address we look up on a fault is already within the cached vma.
>
> The 15% improvement was with Hadoop. According to my notes it was at
> ~48% with the baseline kernel and increased to ~63% with this patch.
>
> In any case I didn't measure the rates at per-task granularity, but at
> the overall system level. When a system is first booted I can see that
> the mmap_cache hit rate becomes the determining factor, and adding a
> workload doesn't change it much. One exception to this was a kernel
> build, where we go from a ~50% to a ~89% hit rate on a vanilla kernel.

I looked at this patch a bit. Its value is in improving the cache hit
ratio for the heap.

1) For single-threaded applications, the heap is frequently the largest
mapping in the process.
2) For the Java VM, "java -Xms1000m -Xmx1000m HelloWorld" produces the
following /proc/<pid>/smaps entry. That is, the JVM allocates a single
heap even when the application is multi-threaded.

c1800000-100000000 rw-p 00000000 00:00 0
Size:            1024000 kB
Rss:                 244 kB
Pss:                 244 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:       244 kB
Referenced:          244 kB
Anonymous:           244 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB

That's good.

However, we know of a situation where this patch doesn't work: glibc
creates a per-thread heap (arena) by default, so the patch cannot be
expected to work well for multi-threaded glibc programs. That's a
fairly big limitation.
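
A quick way to see the per-thread arenas is to allocate from several
threads and then look at the process's mappings. This is an untested
sketch; the thread count and allocation size are arbitrary:

/*
 * arena-demo.c: each thread that calls malloc() can get its own glibc
 * arena, so allocations end up in several separate mappings instead of
 * one large heap. Build with: gcc -pthread arena-demo.c
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NTHREADS 4

static void *worker(void *arg)
{
	char *p = malloc(1 << 20);	/* 1MB from this thread's arena */

	(void)arg;
	if (p)
		p[0] = 1;	/* touch it so the mapping is populated */
	pause();		/* keep the thread (and its arena) alive */
	return NULL;
}

int main(void)
{
	pthread_t tid[NTHREADS];
	char cmd[64];
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);

	sleep(1);	/* crude: give the workers time to allocate */
	snprintf(cmd, sizeof(cmd), "cat /proc/%d/maps", (int)getpid());
	system(cmd);	/* several anonymous arena regions show up */
	return 0;
}

Each worker's allocation lands in its own arena mapping, so a single
cached vma can cover at most one of them.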

Anyway, I haven't observed a real performance difference, because the
biggest penalty in find_vma() comes from taking mmap_sem, not from the
rb-tree search.
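
For reference, the lookup path in question looks roughly like this. It
is a simplified sketch of the mmap_cache check in find_vma(), not the
exact mm/mmap.c source:

/* Caller must hold mmap_sem; that is where the big cost is. */
struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
{
	struct vm_area_struct *vma;
	struct rb_node *rb_node;

	/* First try the one-entry cache; a hit skips the tree walk. */
	vma = mm->mmap_cache;
	if (vma && vma->vm_end > addr && vma->vm_start <= addr)
		return vma;

	/* Cache miss: fall back to the rb-tree search. */
	vma = NULL;
	rb_node = mm->mm_rb.rb_node;
	while (rb_node) {
		struct vm_area_struct *tmp =
			rb_entry(rb_node, struct vm_area_struct, vm_rb);

		if (tmp->vm_end > addr) {
			vma = tmp;	/* lowest vma ending above addr so far */
			if (tmp->vm_start <= addr)
				break;	/* addr falls inside this vma */
			rb_node = rb_node->rb_left;
		} else {
			rb_node = rb_node->rb_right;
		}
	}
	if (vma)
		mm->mmap_cache = vma;	/* remember for the next lookup */
	return vma;
}

Even with a 100% cache hit rate, the mmap_sem acquisition around this
call remains.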

Further input is welcome, but I myself am not yet convinced this patch
works everywhere.