linux-kernel - Re: hugetlb pages not accounted for in rss

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150804025530.GA13210@hori1.linux.bs1.fc.nec.co.jp>
Date:	Tue, 4 Aug 2015 02:55:30 +0000
From:	Naoya Horiguchi <n-horiguchi@...jp.nec.com>
To:	Mike Kravetz <mike.kravetz@...cle.com>
CC:	David Rientjes <rientjes@...gle.com>,
	Jörn Engel <joern@...estorage.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: hugetlb pages not accounted for in rss

On Wed, Jul 29, 2015 at 04:20:59PM -0700, Mike Kravetz wrote:
> On 07/29/2015 12:08 PM, David Rientjes wrote:
> >On Tue, 28 Jul 2015, Jörn Engel wrote:
> >
> >>Well, we definitely need something.  Having a 100GB process show 3GB of
> >>rss is not very useful.  How would we notice a memory leak if it only
> >>affects hugepages, for example?
> >>
> >
> >Since the hugetlb pool is a global resource, it would also be helpful to
> >determine if a process is mapping more than expected.  You can't do that
> >just by adding a huge rss metric, however: if you have 2MB and 1GB
> >hugepages configured you wouldn't know if a process was mapping 512 2MB
> >hugepages or 1 1GB hugepage.
> >
> >That's the purpose of hugetlb_cgroup, after all, and it supports usage
> >counters for all hstates.  The test could be converted to use that to
> >measure usage if configured in the kernel.
> >
> >Beyond that, I'm not sure how a per-hstate rss metric would be exported to
> >userspace in a clean way and other ways of obtaining the same data are
> >possible with hugetlb_cgroup.  I'm not sure how successful you'd be in
> >arguing that we need separate rss counters for it.
>
> If I want to track hugetlb usage on a per-task basis, do I then need to
> create one cgroup per task?
>
> For example, suppose I have many tasks using hugetlb and the global pool
> is getting low on free pages.  It might be useful to know which tasks are
> using hugetlb pages, and how many they are using.
>
> I don't actually have this need (I think), but it appears to be what
> Jörn is asking for.

One possible way to get hugetlb metric in per-task basis is to walk page
table via /proc/pid/pagemap, and counting page flags for each mapped page
(we can easily do this with tools/vm/page-types.c like "page-types -p <PID>
-b huge"). This is obviously slower than just storing the counter as
in-kernel data and just exporting it, but might be useful in some situation.

Thanks,
Naoya Horiguchi