Message-ID: <20150917154152.GD12808@leverpostej>
Date: Thu, 17 Sep 2015 16:41:52 +0100
From: Mark Rutland <mark.rutland@....com>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Ingo Molnar <mingo@...hat.com>, Jiri Olsa <jolsa@...hat.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] perf tools: session: avoid infinite loop
Hi,
On Wed, Sep 16, 2015 at 09:54:54PM +0100, Arnaldo Carvalho de Melo wrote:
> Em Wed, Sep 16, 2015 at 06:18:49PM +0100, Mark Rutland escreveu:
> > This has been observed to result in an exit-time hang when counting
> > rare/unschedulable events with perf record, and can be triggered
> > artificially with the script below:
> >
> > ----
> > #!/bin/sh
> > printf "REPRO: launching perf\n";
> > ./perf record -e software/config=9/ sleep 1 &
> > PERF_PID=$!;
> > sleep 0.002;
> > kill -2 $PERF_PID;
> > printf "REPRO: waiting for perf (%d) to exit...\n" "$PERF_PID";
> > wait $PERF_PID;
> > printf "REPRO: perf exited\n";
> > ----
>
> So, I run it here, without this patch, and get:
>
> [root@zoo ~]# time ./repro.sh
> REPRO: launching perf
> REPRO: waiting for perf (766) to exit...
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.015 MB perf.data ]
> REPRO: perf exited
> real 0m1.060s
> user 0m0.018s
> sys 0m0.037s
[...]
> What am I doing wrong? Trying to reproduce this before even looking at
> the patch :-)

I suspect you have a shinier computer than I do! ;)

It's easier to trigger on slower machines -- decreasing the sleep time
to 0.001 or below may help, though if it's set too low we exit early
without exercising the failing path.
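
Since the useful sleep value depends on the machine, it may be easier to sweep a few values than to hand-edit the script. The following is only a sketch of that idea: `try_delay`, `GRACE` and `RUN_SWEEP` are hypothetical names (nothing here is part of perf), and the five-second grace period is an arbitrary assumption. It sends SIGINT after the given delay, as the original script does, then polls to see whether the process goes away.

```sh
#!/bin/sh
# Sketch only: try_delay, GRACE and RUN_SWEEP are hypothetical names.
GRACE=${GRACE:-5}	# seconds to wait after SIGINT before declaring a hang

# usage: try_delay DELAY COMMAND [ARGS...]
try_delay() {
	delay=$1
	shift
	"$@" >/dev/null 2>&1 &
	pid=$!
	sleep "$delay"
	kill -2 "$pid" 2>/dev/null || true
	waited=0
	# Poll until the child is gone or the grace period runs out.
	while kill -0 "$pid" 2>/dev/null; do
		if [ "$waited" -ge "$GRACE" ]; then
			# Still alive: treat as a hang and clean up.
			kill -9 "$pid" 2>/dev/null || true
			wait "$pid" 2>/dev/null || true
			printf 'delay=%s: HANG\n' "$delay"
			return 1
		fi
		sleep 1
		waited=$((waited + 1))
	done
	wait "$pid" 2>/dev/null || true
	printf 'delay=%s: exited\n' "$delay"
	return 0
}

# Set RUN_SWEEP=1 to try a few delays with the command from the script.
if [ "${RUN_SWEEP:-0}" = 1 ]; then
	for d in 0.004 0.002 0.001 0.0005; do
		try_delay "$d" ./perf record -e software/config=9/ sleep 1 || true
	done
fi
```

Note that "exited" does not distinguish a normal exit from the too-early exit mentioned above, so a run of "exited" results at very small delays does not by itself mean the failing path was exercised. The sketch also discards perf's own output, and each attempt still writes perf.data in the current directory.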

To trigger the bug, it looks like we need to call record__mmap_read_all
successfully at least once (without reading any data), then break out of
the loop because hits == rec->samples && done. That way we call
process_buildids with no data having been written.

I added some logging of rec->bytes_written immediately before and after
the main loop. In every case where we've written something we exit, and in
every case where we haven't, we hang. I've left some examples (from an
arm64 system) below.
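
The correlation described above can be checked mechanically over saved output. This is a minimal sketch, assuming the output was captured to a file and contains the ad-hoc "Post-loop rec wrote N" line from the extra logging (that line is not printed by an unmodified perf); `predict_from_log` is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch only: predict_from_log is a hypothetical helper.
# Reads captured perf output on stdin and prints the outcome implied by
# the last "Post-loop rec wrote N" debug line.
predict_from_log() {
	n=$(sed -n 's/^Post-loop rec wrote \([0-9][0-9]*\)$/\1/p' | tail -n 1)
	case "$n" in
	'') echo "unknown" ;;			# debug line not present
	0)  echo "hang expected" ;;		# nothing written before exit
	*)  echo "clean exit expected" ;;	# some data was written
	esac
}
```

For example, `predict_from_log < run1.log` (with `run1.log` being whatever file the output was saved to) prints one of the three strings above.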
Thanks,
Mark.
mark@...bensteg:~$ ./repro.sh
REPRO: launching perf
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Waiting for perf (2104) to exit...
Cannot read kernel map
Couldn't record kernel reference relocation symbol
Symbol resolution may be skewed if relocation was used (e.g. kexec).
Check /proc/kallsyms permission or run as root.
Pre-loop rec wrote 0
Post-loop rec wrote 2128
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data ]
Perf exited
mark@...bensteg:~$ ./repro.sh
REPRO: launching perf
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Waiting for perf (2099) to exit...
Cannot read kernel map
Couldn't record kernel reference relocation symbol
Symbol resolution may be skewed if relocation was used (e.g. kexec).
Check /proc/kallsyms permission or run as root.
Pre-loop rec wrote 0
Post-loop rec wrote 0
[ perf record: Woken up 0 times to write data ]
< hang here >
mark@...bensteg:~$ ./repro.sh
REPRO: launching perf
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Cannot read kernel map
Couldn't record kernel reference relocation symbol
Symbol resolution may be skewed if relocation was used (e.g. kexec).
Check /proc/kallsyms permission or run as root.
Pre-loop rec wrote 0
Waiting for perf (2108) to exit...
Post-loop rec wrote 0
[ perf record: Woken up 1 times to write data ]
< hang here >