lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.10.1508101727230.28691@chino.kir.corp.google.com>
Date:	Mon, 10 Aug 2015 17:37:54 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Naoya Horiguchi <n-horiguchi@...jp.nec.com>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Jörn Engel <joern@...estorage.com>,
	Mike Kravetz <mike.kravetz@...cle.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Naoya Horiguchi <nao.horiguchi@...il.com>
Subject: Re: [PATCH v2 1/2] smaps: fill missing fields for vma(VM_HUGETLB)

On Fri, 7 Aug 2015, Naoya Horiguchi wrote:

> Currently smaps reports many zero fields for vma(VM_HUGETLB), which is
> inconvenient when we want to know per-task or per-vma base hugetlb usage.
> This patch enables these fields by introducing smaps_hugetlb_range().
> 
> before patch:
> 
>   Size:              20480 kB
>   Rss:                   0 kB
>   Pss:                   0 kB
>   Shared_Clean:          0 kB
>   Shared_Dirty:          0 kB
>   Private_Clean:         0 kB
>   Private_Dirty:         0 kB
>   Referenced:            0 kB
>   Anonymous:             0 kB
>   AnonHugePages:         0 kB
>   Swap:                  0 kB
>   KernelPageSize:     2048 kB
>   MMUPageSize:        2048 kB
>   Locked:                0 kB
>   VmFlags: rd wr mr mw me de ht
> 
> after patch:
> 
>   Size:              20480 kB
>   Rss:               18432 kB
>   Pss:               18432 kB
>   Shared_Clean:          0 kB
>   Shared_Dirty:          0 kB
>   Private_Clean:         0 kB
>   Private_Dirty:     18432 kB
>   Referenced:        18432 kB
>   Anonymous:         18432 kB
>   AnonHugePages:         0 kB
>   Swap:                  0 kB
>   KernelPageSize:     2048 kB
>   MMUPageSize:        2048 kB
>   Locked:                0 kB
>   VmFlags: rd wr mr mw me de ht
> 

I think this will lead to breakage, unfortunately, specifically for users 
who are concerned with resource management.

An example: we use memcg hierarchies to charge memory for individual jobs, 
specific users, and system overhead.  Memcg is a cgroup, so this is done 
for an aggregate of processes, and we often have to monitor their memory 
usage.  Each process isn't assigned to its own memcg, and I don't believe 
common users of memcg assign individual processes to their own memcgs.  

When a memcg is out of memory, we need to track the memory usage of 
processes attached to its memcg hierarchy to determine what is unexpected, 
either as a result of a new rollout or because of a memory leak.  To do 
that, we use the rss exported by smaps that is now changed with this 
patch.  By using smaps rather than /proc/pid/status, we can report where 
memory usage is unexpected.

This would cause our process that manages all memcgs on our systems to 
break.  Perhaps I haven't been as convincing in my previous messages of 
this, but it's quite an obvious userspace regression.

This memory was not included in rss originally because memory in the 
hugetlb persistent pool is always resident.  Unmapping the memory does not 
free memory.  For this reason, hugetlb memory has always been treated as 
its own type of memory.

It would have been arguable back when hugetlbfs was introduced whether it 
should be included.  I'm afraid the ship has sailed on that since a decade 
has past and it would cause userspace to break if existing metrics are 
used that already have cleared defined semantics.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ