lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170315135059.GC2177@redhat.com>
Date:   Wed, 15 Mar 2017 10:50:59 -0300
From:   Arnaldo Carvalho de Melo <acme@...hat.com>
To:     Stephane Eranian <eranian@...gle.com>
Cc:     linux-kernel@...r.kernel.org, peterz@...radead.org,
        Ingo Molnar <mingo@...nel.org>, Jiri Olsa <jolsa@...hat.com>,
        namhyung.kim@...nel.org, acme@...nel.org
Subject: Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events()
 scale

Em Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian escreveu:
> This patch significantly improves the execution time of
> perf_event__synthesize_mmap_events() when running perf record
> on systems where processes have lots of threads. It just happens
> that cat /proc/pid/maps support uses a O(N^2) algorithm to generate
> each map line in the maps file.  If you have 1000 threads, then you have
> necessarily 1000 stacks.  For each vma, you need to check if it corresponds
> to a thread's stack.  With a large number of threads, this can take a very long time. I have seen latencies >> 10mn.
> 
> As of today, perf does not use the fact that a mapping is a stack,
> therefore we can work around the issue by using /proc/pid/tasks/pid/maps.
> This entry does not try to map a vma to stack and is thus much
> faster with no loss of functonality.
> 
> The proc-map-timeout logic is kept in case user still want some uppre limit.
> 
> Signed-off-by: Stephane Eranian <eranian@...gle.com>
> ---
>  tools/perf/util/event.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 4ea7ce7..b137566 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
>  	if (machine__is_default_guest(machine))
>  		return 0;
>  
> -	snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
> -		 machine->root_dir, pid);
> +	snprintf(filename, sizeof(filename), "%s/proc/%d/tasks/%d/maps",
> +		 machine->root_dir, pid, pid);

Humm...

[root@...et ~]# ps -C firefox -L
  PID   LWP TTY          TIME CMD
25023 25023 tty2     01:32:10 firefox
25023 25048 tty2     00:00:00 gmain
25023 25049 tty2     00:00:00 gdbus
25023 25056 tty2     00:00:00 Gecko_IOThread
25023 25058 tty2     00:00:00 Link Monitor
25023 25059 tty2     00:03:52 Socket Thread
25023 25060 tty2     00:00:00 JS Watchdog
25023 25061 tty2     00:00:12 JS Helper
25023 25062 tty2     00:00:12 JS Helper
25023 25063 tty2     00:00:13 JS Helper
25023 25064 tty2     00:00:12 JS Helper
25023 25065 tty2     00:00:13 JS Helper
25023 25066 tty2     00:00:12 JS Helper
25023 25067 tty2     00:00:13 JS Helper
25023 25068 tty2     00:00:13 JS Helper
25023 25069 tty2     00:00:00 Hang Monitor
25023 25070 tty2     00:00:00 BgHangManager
25023 25117 tty2     00:00:12 Cache2 I/O
25023 25118 tty2     00:05:37 Timer
25023 25146 tty2     00:00:00 DataStorage
25023 25150 tty2     00:00:00 GMPThread
25023 25152 tty2     00:04:12 Compositor
25023 25156 tty2     00:00:02 ImgDecoder #1
25023 25157 tty2     00:00:02 ImgDecoder #2
25023 25158 tty2     00:00:02 ImgDecoder #3
25023 25159 tty2     00:00:00 ImageIO
25023 25160 tty2     00:02:26 SoftwareVsyncTh
25023 25202 tty2     00:00:00 HTML5 Parser
25023 25204 tty2     00:00:00 IPDL Background
25023 25212 tty2     00:00:22 DOM Worker
25023 25219 tty2     00:00:23 URL Classifier
25023 25220 tty2     00:00:05 ImageBridgeChil
25023 25221 tty2     00:00:04 mozStorage #1
25023 25222 tty2     00:00:00 Proxy R~olution
25023 25223 tty2     00:00:00 DataStorage
25023 25234 tty2     00:00:00 Cache I/O
25023 25235 tty2     00:00:01 mozStorage #2
25023 25236 tty2     00:00:00 mozStorage #3
25023 25252 tty2     00:00:20 DOM Worker
25023 25256 tty2     00:00:00 mozStorage #4
25023 25257 tty2     00:00:00 localStorage DB
25023 25260 tty2     00:00:00 mozStorage #5
25023  2066 tty2     00:00:19 DOM Worker
25023  5988 tty2     00:00:00 threaded-ml
25023  6757 tty2     00:00:00 firefox
25023  6999 tty2     00:00:00 mozStorage #6
25023 19239 tty2     00:00:02 mozStorage #7
25023 12038 tty2     00:00:05 threaded-ml
25023 19301 tty2     00:00:00 firefox
[root@...et ~]# 

[root@...et ~]# file /proc/25023/tasks
/proc/25023/tasks: cannot open `/proc/25023/tasks' (No such file or directory)
[root@...et ~]#

But...

[root@...et ~]# file /proc/25023/task
/proc/25023/task: directory
[root@...et ~]# wc -l /proc/25023/task/25023/maps
1200 /proc/25023/task/25023/maps
[root@...et ~]# wc -l /proc/25023/task/25048/maps
1201 /proc/25023/task/25048/maps
[root@...et ~]# wc -l /proc/25048/task/25048/maps
1201 /proc/25048/task/25048/maps
[root@...et ~]#

[acme@...et linux]$ grep \"tasks\"  fs/proc/*.c
[acme@...et linux]$ grep \"task\"  fs/proc/*.c
fs/proc/base.c:	DIR("task",       S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations),
fs/proc/base.c:	name.name = "task";
[acme@...et linux]$

These end up mapping to the later of these two:

const struct file_operations proc_pid_maps_operations = {
        .open           = pid_maps_open,
        .read           = seq_read,
        .llseek         = seq_lseek,
        .release        = proc_map_release,
};

const struct file_operations proc_tid_maps_operations = {
        .open           = tid_maps_open,
        .read           = seq_read, 
        .llseek         = seq_lseek,
        .release        = proc_map_release,
};


Which ends up going down to:

show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)

with that is_pid respectively true and false, but then show_map_vma() doesn't
use 'is_pid' at all :-\

/me scratches head, what am I missing?

The following commit, that appeared circa v4.9-rc2. So, fixing up the "tasks"
-> "tasks" we end up with something safe and that avoids this by now
non-existing problem, on older kernels, ok?

commit b18cb64ead400c01bf1580eeba330ace51f8087d
Author: Andy Lutomirski <luto@...nel.org>
Date:   Fri Sep 30 10:58:57 2016 -0700

    fs/proc: Stop trying to report thread stacks
    
    This reverts more of:
    
      b76437579d13 ("procfs: mark thread stack correctly in proc/<pid>/maps")
    
    ... which was partially reverted by:
    
      65376df58217 ("proc: revert /proc/<pid>/maps [stack:TID] annotation")
    
    Originally, /proc/PID/task/TID/maps was the same as /proc/TID/maps.
    
    In current kernels, /proc/PID/maps (or /proc/TID/maps even for
    threads) shows "[stack]" for VMAs in the mm's stack address range.
    
    In contrast, /proc/PID/task/TID/maps uses KSTK_ESP to guess the
    target thread's stack's VMA.  This is racy, probably returns garbage
    and, on arches with CONFIG_TASK_INFO_IN_THREAD=y, is also crash-prone:
    KSTK_ESP is not safe to use on tasks that aren't known to be running
    ordinary process-context kernel code.
    
    This patch removes the difference and just shows "[stack]" for VMAs
    in the mm's stack range.  This is IMO much more sensible -- the
    actual "stack" address really is treated specially by the VM code,
    and the current thread stack isn't even well-defined for programs
    that frequently switch stacks on their own.
    
    Reported-by: Jann Horn <jann@...jh.net>
    Signed-off-by: Andy Lutomirski <luto@...nel.org>
    Acked-by: Thomas Gleixner <tglx@...utronix.de>


>  
>  	fp = fopen(filename, "r");
>  	if (fp == NULL) {
> -- 
> 2.5.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ