linux-kernel - Re: [PACTH v2 0/3] Implement /proc/<pid>/totmaps

Message-ID: <20160817082200.GA10547@dhcp22.suse.cz>
Date:	Wed, 17 Aug 2016 10:22:00 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Robert Foss <robert.foss@...labora.com>
Cc:	sonnyrao@...omium.org, corbet@....net, akpm@...ux-foundation.org,
	vbabka@...e.cz, koct9i@...il.com, hughd@...gle.com,
	n-horiguchi@...jp.nec.com, minchan@...nel.org,
	john.stultz@...aro.org, ross.zwisler@...ux.intel.com,
	jmarchan@...hat.com, hannes@...xchg.org, keescook@...omium.org,
	viro@...iv.linux.org.uk, gorcunov@...nvz.org,
	plaguedbypenguins@...il.com, rientjes@...gle.com,
	eric.engestrom@...tec.com, jdanis@...gle.com, calvinowens@...com,
	adobriyan@...il.com, jann@...jh.net,
	kirill.shutemov@...ux.intel.com, ldufour@...ux.vnet.ibm.com,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
	Ben Zhang <benzh@...omium.org>,
	Bryan Freed <bfreed@...omium.org>,
	Filipe Brandenburger <filbranden@...omium.org>,
	Mateusz Guzik <mguzik@...hat.com>
Subject: Re: [PACTH v2 0/3] Implement /proc/<pid>/totmaps

On Tue 16-08-16 12:46:51, Robert Foss wrote:
[...]
> $ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2}
> /^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\'
> /proc/5025/smaps }"
> [...]
> 	Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2}
> /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps
> }"
> 	User time (seconds): 0.37
> 	System time (seconds): 0.45
> 	Percent of CPU this job got: 92%
> 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89

This is really unexpected. Where is the user time spent? Anyway, rather
than measuring some random processes I've tried to measure something
resembling the worst case. So I've created a simple program to mmap as
much as possible:

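#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
	/* Create one-page shared anonymous mappings until mmap() fails,
	 * which is expected once the max_map_count limit is hit. */
	while (mmap(NULL, 4096, PROT_READ|PROT_WRITE,
		    MAP_ANON|MAP_SHARED|MAP_POPULATE, -1, 0) != MAP_FAILED)
		;

	/* Print the pid, then sleep so /proc/<pid>/smaps can be read. */
	printf("pid:%d\n", (int)getpid());
	fflush(stdout);
	pause();
	return 0;
}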

So, depending on /proc/sys/vm/max_map_count, you will get the maximum
possible number of mappings. I am using the default, so about 65k
mappings. Then I retried your 25x file parsing:
$ cat s.sh
#!/bin/sh

pid=$1
for i in $(seq 25)
do
	awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/$pid/smaps
done
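
For illustration only, the same accumulation could be sketched in C. This
is a minimal sketch, not part of the measurements in this thread: the
names sum_smaps and struct smaps_totals are hypothetical, and it assumes
the "Rss:" and "Pss:" field names that the awk patterns above match.

```c
#include <stdio.h>

/* Totals accumulated from one smaps-formatted stream (values in kB).
 * Hypothetical helper type, introduced for this sketch only. */
struct smaps_totals {
	long rss_kb;
	long pss_kb;
};

/*
 * Sum the "Rss:" and "Pss:" fields over all mappings in the stream.
 * Returns 0 on success, -1 if a read error occurred.
 */
int sum_smaps(FILE *f, struct smaps_totals *t)
{
	char line[4096 + 256];
	long val;

	t->rss_kb = 0;
	t->pss_kb = 0;
	while (fgets(line, sizeof(line), f)) {
		/* Only lines starting with the exact field name match. */
		if (sscanf(line, "Rss: %ld", &val) == 1)
			t->rss_kb += val;
		else if (sscanf(line, "Pss: %ld", &val) == 1)
			t->pss_kb += val;
	}
	return ferror(f) ? -1 : 0;
}
```

A caller would fopen() /proc/<pid>/smaps, pass the stream to sum_smaps()
and print the two totals, much like the END block of the awk script.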

But I am getting different results from you:
$ awk '/^[0-9a-f]/{print}' /proc/14808/smaps | wc -l
65532
[...]
        Command being timed: "sh s.sh 14808"
        User time (seconds): 0.00
        System time (seconds): 20.10
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:20.20

The results are stable across multiple runs; in fact, there shouldn't
be any reason for them not to be. Then I increased max_map_count to
250k, and that behaves consistently:
$ awk '/^[0-9a-f]/{print}' /proc/16093/smaps | wc -l
250002
[...]
        Command being timed: "sh s.sh 16093"
        User time (seconds): 0.00
        System time (seconds): 77.93
        Percent of CPU this job got: 98%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:19.09
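
To make "consistently" concrete, the two runs can be normalized to system
time per mapping per read. The sketch below is only a worked calculation
over the figures reported above; the function name usec_per_mapping is
hypothetical. It gives roughly 12.3 usec per mapping for the 65532-mapping
run and roughly 12.5 usec for the 250002-mapping run, so the cost appears
to scale about linearly with the number of mappings.

```c
/*
 * Average system time spent per mapping for a single read of smaps,
 * in microseconds. sys_seconds is the total system time for all reads.
 */
double usec_per_mapping(double sys_seconds, int reads, long mappings)
{
	return sys_seconds / reads / mappings * 1e6;
}
```

For example, usec_per_mapping(20.10, 25, 65532) covers the first run and
usec_per_mapping(77.93, 25, 250002) the second.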

So with a reasonable user space, the parsing is really not all that time
consuming relative to the smaps handling itself. That being said, I am
still very skeptical about a dedicated proc file which accomplishes what
userspace can do in a trivial way.
-- 
Michal Hocko
SUSE Labs