Message-ID: <20150811233237.GA32192@hori1.linux.bs1.fc.nec.co.jp>
Date: Tue, 11 Aug 2015 23:32:38 +0000
From: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
To: David Rientjes <rientjes@...gle.com>
CC: Andrew Morton <akpm@...ux-foundation.org>,
Jörn Engel <joern@...estorage.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Naoya Horiguchi <nao.horiguchi@...il.com>
Subject: Re: [PATCH v2 1/2] smaps: fill missing fields for vma(VM_HUGETLB)

On Mon, Aug 10, 2015 at 05:37:54PM -0700, David Rientjes wrote:
> On Fri, 7 Aug 2015, Naoya Horiguchi wrote:
>
> > Currently smaps reports many zero fields for vma(VM_HUGETLB), which is
> > inconvenient when we want to know per-task or per-vma base hugetlb usage.
> > This patch enables these fields by introducing smaps_hugetlb_range().
> >
> > before patch:
> >
> > Size: 20480 kB
> > Rss: 0 kB
> > Pss: 0 kB
> > Shared_Clean: 0 kB
> > Shared_Dirty: 0 kB
> > Private_Clean: 0 kB
> > Private_Dirty: 0 kB
> > Referenced: 0 kB
> > Anonymous: 0 kB
> > AnonHugePages: 0 kB
> > Swap: 0 kB
> > KernelPageSize: 2048 kB
> > MMUPageSize: 2048 kB
> > Locked: 0 kB
> > VmFlags: rd wr mr mw me de ht
> >
> > after patch:
> >
> > Size: 20480 kB
> > Rss: 18432 kB
> > Pss: 18432 kB
> > Shared_Clean: 0 kB
> > Shared_Dirty: 0 kB
> > Private_Clean: 0 kB
> > Private_Dirty: 18432 kB
> > Referenced: 18432 kB
> > Anonymous: 18432 kB
> > AnonHugePages: 0 kB
> > Swap: 0 kB
> > KernelPageSize: 2048 kB
> > MMUPageSize: 2048 kB
> > Locked: 0 kB
> > VmFlags: rd wr mr mw me de ht
> >
>
> I think this will lead to breakage, unfortunately, specifically for users
> who are concerned with resource management.
>
> An example: we use memcg hierarchies to charge memory for individual jobs,
> specific users, and system overhead. Memcg is a cgroup, so this is done
> for an aggregate of processes, and we often have to monitor their memory
> usage. Each process isn't assigned to its own memcg, and I don't believe
> common users of memcg assign individual processes to their own memcgs.
>
> When a memcg is out of memory, we need to track the memory usage of
> processes attached to its memcg hierarchy to determine what is unexpected,
> either as a result of a new rollout or because of a memory leak. To do
> that, we use the rss exported by smaps that is now changed with this
> patch. By using smaps rather than /proc/pid/status, we can report where
> memory usage is unexpected.
>
> This would cause our process that manages all memcgs on our systems to
> break. Perhaps I haven't been as convincing in my previous messages of
> this, but it's quite an obvious userspace regression.

OK, this version assumes that userspace distinguishes vma(VM_HUGETLB) by the
"VmFlags" field, which is unrealistic. So I'll keep all existing fields
untouched and introduce separate hugetlb usage info instead.

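For illustration only (this is not part of the patch): a rough Python sketch
of the kind of userspace accounting discussed above. It sums "Rss" over all
VMAs in smaps-formatted text, and separately over the VMAs whose "VmFlags"
contain "ht". The helper names, the sample addresses and the sample paths are
made up for this example; only the field names and the 18432 kB figure come
from the output quoted earlier.

```python
import re

# A smaps field line looks like "Rss:   18432 kB" or "VmFlags: rd wr ht".
# A VMA header line starts with an address range, so it does not match.
FIELD = re.compile(r"^(\w+):\s+(.*)$")


def parse_smaps(text):
    """Split smaps-formatted text into one dict of fields per VMA."""
    vmas = []
    for line in text.splitlines():
        m = FIELD.match(line)
        if m is None:
            if line.strip():
                vmas.append({})  # a header line starts a new VMA
            continue
        if not vmas:
            vmas.append({})  # tolerate input without a header line
        vmas[-1][m.group(1)] = m.group(2)
    return vmas


def kb(vma, field):
    """Return a "<n> kB" field as an int, or 0 if the field is absent."""
    return int(vma.get(field, "0 kB").split()[0])


def rss_totals(text):
    """Return (Rss over all VMAs, Rss over VMAs flagged "ht"), in kB."""
    total = 0
    hugetlb = 0
    for vma in parse_smaps(text):
        rss = kb(vma, "Rss")
        total += rss
        if "ht" in vma.get("VmFlags", "").split():
            hugetlb += rss
    return total, hugetlb


# Hypothetical two-VMA sample: one small regular mapping and one hugetlb
# mapping that reports Rss the way the "after patch" output does.
SAMPLE = """\
00400000-00401000 r-xp 00000000 08:01 1 /hypothetical/bin
Size: 4 kB
Rss: 4 kB
VmFlags: rd ex mr mw me
2aaaaac00000-2aaaac000000 rw-p 00000000 00:0e 2 /hypothetical/hugepage
Size: 20480 kB
Rss: 18432 kB
VmFlags: rd wr mr mw me de ht
"""
```

A monitor that only looks at the first value sees its total jump by the
hugetlb amount once "Rss" is filled in for these VMAs; it would have to
subtract the second value to get the old number back, which is exactly the
VmFlags check that existing tools cannot be expected to do.
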
> This memory was not included in rss originally because memory in the
> hugetlb persistent pool is always resident. Unmapping the memory does not
> free memory. For this reason, hugetlb memory has always been treated as
> its own type of memory.

Right, so it might be better not to use the word "RSS" for hugetlb;
something like "HugetlbPages:" seems better to me.

Thanks,
Naoya Horiguchi

> It would have been arguable back when hugetlbfs was introduced whether it
> should be included. I'm afraid the ship has sailed on that since a decade
> has passed and it would cause userspace to break if existing metrics are
> used that already have clearly defined semantics.