Date:	Sun, 03 Nov 2013 20:22:51 -0800
From:	Davidlohr Bueso <davidlohr@...com>
To:	KOSAKI Motohiro <kosaki.motohiro@...il.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Hugh Dickins <hughd@...gle.com>,
	Michel Lespinasse <walken@...gle.com>,
	Ingo Molnar <mingo@...nel.org>, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>,
	Guan Xuetao <gxt@...c.pku.edu.cn>, aswin@...com,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH] mm: cache largest vma

On Sun, 2013-11-03 at 18:57 -0500, KOSAKI Motohiro wrote:
> >> I'm slightly surprised this cache gives a 15% hit-rate improvement.
> >> Which applications benefit? You listed a lot of applications, but I'm
> >> not sure which ones depend heavily on the largest vma.
> >
> > Well I chose the largest vma because it gives us a greater chance of
> > being already cached when we do the lookup for the faulted address.
> >
> > The 15% improvement was with Hadoop. According to my notes it was at
> > ~48% with the baseline kernel and increased to ~63% with this patch.
> >
> > In any case I didn't measure the rates on a per-task granularity, but at
> > a general system level. When a system is first booted I can see that the
> > mmap_cache access rate becomes the determinant factor and when adding a
> > workload it doesn't change much. One exception to this was a kernel
> > build, where we go from ~50% to ~89% hit rate on a vanilla kernel.
> 
> I looked at this patch a bit. Its value is in improving the cache hit
> ratio of the heap.
> 
> 1) For single-threaded applications, the heap is frequently the largest
> mapping in the process.

Right.

> 2) For the Java VM, "java -Xms1000m -Xmx1000m HelloWorld" produces the
> following /proc/<pid>/smaps entry. That is, the JVM allocates a single
> heap even if applications are multithreaded.

Oh, this is new to me and nicely explains why I see the most benefit in
Java-related workloads.

> 
> c1800000-100000000 rw-p 00000000 00:00 0
> Size:            1024000 kB
> Rss:                 244 kB
> Pss:                 244 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:       244 kB
> Referenced:          244 kB
> Anonymous:           244 kB
> AnonHugePages:         0 kB
> Swap:                  0 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> 
> That's good.
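
As a quick sanity check on the smaps entry quoted above, the address range
in its first line should account for the whole -Xms1000m heap. A minimal
sketch in C, assuming a hypothetical helper name (mapping_size_kb) that is
not part of any kernel or libc API:

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse the "start-end" prefix of a /proc/<pid>/maps
 * or smaps header line and return the mapping size in kB, or -1 if the
 * line does not start with a valid hexadecimal range.
 */
static long long mapping_size_kb(const char *line)
{
    unsigned long long start, end;

    /* unsigned long long keeps 0x100000000 intact even on 32-bit builds */
    if (sscanf(line, "%llx-%llx", &start, &end) != 2 || end < start)
        return -1;

    return (long long)((end - start) / 1024);
}
```

For the quoted line, 0x100000000 - 0xc1800000 = 0x3e800000 bytes, which is
1024000 kB (1000 MiB) and matches the Size: field.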
> 
> However, we know there is a situation where this patch doesn't help.
> glibc creates a per-thread heap (arena) by default, so the patch is not
> expected to work well on glibc multithreaded programs. That's a
> fairly big limitation.

I think this is what Linus was referring to.

> 
> Anyway, I haven't observed a real performance difference, because the
> biggest penalty of find_vma comes from taking mmap_sem, not the rb-tree search.

Yes, undoubtedly, which is why I'm using units of hit/miss rather than
workload throughput.
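
To make the hit/miss accounting concrete, here is a rough userspace sketch
of the idea, not the patch itself. All names (toy_vma, toy_mm, toy_mm_init,
toy_find_vma) are made up for illustration; a binary search over a sorted
array stands in for the rb-tree walk, locking (mmap_sem) is left out
entirely, and unlike the kernel's find_vma() the lookup only returns a
mapping that actually contains the address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's vm_area_struct. */
struct toy_vma {
    unsigned long start;    /* inclusive */
    unsigned long end;      /* exclusive */
};

struct toy_mm {
    struct toy_vma *vmas;     /* sorted by address, non-overlapping */
    size_t nr_vmas;
    struct toy_vma *largest;  /* cached largest mapping, NULL if none */
    unsigned long hits;
    unsigned long misses;
};

/* Record the mappings and remember which one is largest. */
static void toy_mm_init(struct toy_mm *mm, struct toy_vma *vmas, size_t n)
{
    size_t i;

    assert(vmas != NULL || n == 0);
    mm->vmas = vmas;
    mm->nr_vmas = n;
    mm->largest = NULL;
    mm->hits = 0;
    mm->misses = 0;

    for (i = 0; i < n; i++) {
        struct toy_vma *v = &vmas[i];

        if (!mm->largest ||
            v->end - v->start > mm->largest->end - mm->largest->start)
            mm->largest = v;
    }
}

/* Check the cached largest mapping first, then fall back to a search. */
static struct toy_vma *toy_find_vma(struct toy_mm *mm, unsigned long addr)
{
    struct toy_vma *c = mm->largest;
    size_t lo = 0, hi = mm->nr_vmas;

    if (c && addr >= c->start && addr < c->end) {
        mm->hits++;
        return c;
    }
    mm->misses++;

    /* Binary search as a stand-in for the rb-tree walk. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        struct toy_vma *v = &mm->vmas[mid];

        if (addr < v->start)
            hi = mid;
        else if (addr >= v->end)
            lo = mid + 1;
        else
            return v;
    }
    return NULL;
}
```

The ratio hits / (hits + misses) is the kind of number quoted in this
thread; it says nothing about throughput, since the sketch ignores the
cost of taking the lock.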

Thanks,
Davidlohr
