lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160817093125.GA27782@pc.thejh.net>
Date:	Wed, 17 Aug 2016 11:31:25 +0200
From:	Jann Horn <jann@...jh.net>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	Robert Foss <robert.foss@...labora.com>, sonnyrao@...omium.org,
	corbet@....net, akpm@...ux-foundation.org, vbabka@...e.cz,
	koct9i@...il.com, hughd@...gle.com, n-horiguchi@...jp.nec.com,
	minchan@...nel.org, john.stultz@...aro.org,
	ross.zwisler@...ux.intel.com, jmarchan@...hat.com,
	hannes@...xchg.org, keescook@...omium.org, viro@...iv.linux.org.uk,
	gorcunov@...nvz.org, plaguedbypenguins@...il.com,
	rientjes@...gle.com, eric.engestrom@...tec.com, jdanis@...gle.com,
	calvinowens@...com, adobriyan@...il.com,
	kirill.shutemov@...ux.intel.com, ldufour@...ux.vnet.ibm.com,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
	Ben Zhang <benzh@...omium.org>,
	Bryan Freed <bfreed@...omium.org>,
	Filipe Brandenburger <filbranden@...omium.org>,
	Mateusz Guzik <mguzik@...hat.com>
Subject: Re: [PACTH v2 0/3] Implement /proc/<pid>/totmaps

On Wed, Aug 17, 2016 at 10:22:00AM +0200, Michal Hocko wrote:
> On Tue 16-08-16 12:46:51, Robert Foss wrote:
> [...]
> > $ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2}
> > /^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\'
> > /proc/5025/smaps }"
> > [...]
> > 	Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2}
> > /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\' /proc/5025/smaps
> > }"
> > 	User time (seconds): 0.37
> > 	System time (seconds): 0.45
> > 	Percent of CPU this job got: 92%
> > 	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89
> 
> This is really unexpected. Where is the user time spent? Anyway, rather
> than measuring some random processes I've tried to measure something
> resembling the worst case. So I've created a simple program to mmap as
> much as possible:
> 
> #include <sys/mman.h>
> #include <sys/types.h>
> #include <unistd.h>
> #include <stdio.h>
> int main()
> {
> 	while (mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED|MAP_POPULATE, -1, 0) != MAP_FAILED)
> 		;
> 
> 	printf("pid:%d\n", getpid());
> 	pause();
> 	return 0;
> }

Ah, nice, that's a reasonable test program. :)


> So with a reasonable user space the parsing is really not all that time
> consuming wrt. smaps handling. That being said I am still very skeptical
> about a dedicated proc file which accomplishes what userspace can done
> in a trivial way.

Now, since your numbers showed that all the time is spent in the kernel,
also create this test program to just read that file over and over again:

$ cat justreadloop.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>
#include <err.h>
#include <stdio.h>

char buf[1000000];

int main(int argc, char **argv) {
  printf("pid:%d\n", getpid());
  while (1) {
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) continue;
    if (read(fd, buf, sizeof(buf)) < 0)
      err(1, "read");
    close(fd);
  }
}
$ gcc -Wall -o justreadloop justreadloop.c
$ 

Now launch your test:

$ ./mapstuff 
pid:29397

point justreadloop at it:

$ ./justreadloop /proc/29397/smaps
pid:32567

... and then check the performance stats of justreadloop:

# perf top -p 32567

This is what I see:

Samples: 232K of event 'cycles:ppp', Event count (approx.): 60448424325
Overhead  Shared Object     Symbol
  30,43%  [kernel]          [k] format_decode
   9,12%  [kernel]          [k] number
   7,66%  [kernel]          [k] vsnprintf
   7,06%  [kernel]          [k] __lock_acquire
   3,23%  [kernel]          [k] lock_release
   2,85%  [kernel]          [k] debug_lockdep_rcu_enabled
   2,25%  [kernel]          [k] skip_atoi
   2,13%  [kernel]          [k] lock_acquire
   2,05%  [kernel]          [k] show_smap

That's at least 30.43% + 9.12% + 7.66% = 47.21% of the task's kernel
time spent on evaluating format strings. The new interface
wouldn't have to spend that much time on format strings because there
isn't so much text to format. (My kernel is built with a
bunch of debug options - the results might look very different on
distro kernels or so, so please try this yourself.)

I guess it could be argued that this is not just a problem with
smaps, but also a problem with format strings (or text-based interfaces
in general) just being slow in general.

(Here is a totally random and crazy thought: Can we put something into
the kernel build process that replaces printf calls that use simple
format strings with equivalent non-printf calls? Move the cost of
evaluating the format string to compile time?)

Download attachment "signature.asc" of type "application/pgp-signature" (820 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ