Date:	Fri, 30 Jan 2015 12:22:16 -0300
From:	Arnaldo Carvalho de Melo <acme@...nel.org>
To:	Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
Cc:	Jiri Olsa <jolsa@...hat.com>, linux-kernel@...r.kernel.org,
	zhlbj@...ibm.com
Subject: Re: perf_evlist__filter_pollfd() in trace__run()

On Thu, Jan 29, 2015 at 03:55:22PM -0800, Sukadev Bhattiprolu wrote:
> Arnaldo,
> 
> On one of our systems we are seeing an intermittent SIGSEGV with
> 
> 	perf trace sleep 1
> 
> and I have a question about the 'draining' flag below:
> | 
> | From 46fb3c21d20415dd2693570c33d0ea6eb8745e04 Mon Sep 17 00:00:00 2001
> | From: Arnaldo Carvalho de Melo <acme@...hat.com>
> | Date: Mon, 22 Sep 2014 14:39:48 -0300
> | Subject: [PATCH 1/1] perf trace: Filter out POLLHUP'ed file descriptors
> | 
> | So that we don't continue polling on vanished file descriptors, i.e.
> | file descriptors for events monitoring threads that exited.
> | 
> | I.e. the following 'trace' command now exits as expected, instead
> | of staying in an eternal loop:
> | 
> |       $ sleep 5s &
> |       $ trace -p `pidof sleep`
> | 
> | Reported-by: Jiri Olsa <jolsa@...hat.com>
> | Cc: Adrian Hunter <adrian.hunter@...el.com>
> | Cc: David Ahern <dsahern@...il.com>
> | Cc: Don Zickus <dzickus@...hat.com>
> | Cc: Frederic Weisbecker <fweisbec@...il.com>
> | Cc: Jiri Olsa <jolsa@...hat.com>
> | Cc: Mike Galbraith <efault@....de>
> | Cc: Namhyung Kim <namhyung@...nel.org>
> | Cc: Paul Mackerras <paulus@...ba.org>
> | Cc: Peter Zijlstra <peterz@...radead.org>
> | Cc: Stephane Eranian <eranian@...gle.com>
> | Link: http://lkml.kernel.org/n/tip-6qegv786zbf6i8us6t4rxug9@git.kernel.org
> | Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>
> | ---
> |  tools/perf/builtin-trace.c | 7 ++++++-
> |  1 file changed, 6 insertions(+), 1 deletion(-)
> | 
> | diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> | index b8fedf3..fe39dc6 100644
> | --- a/tools/perf/builtin-trace.c
> | +++ b/tools/perf/builtin-trace.c
> | @@ -2044,6 +2044,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> |  	int err = -1, i;
> |  	unsigned long before;
> |  	const bool forks = argc > 0;
> | +	bool draining = false;
> |  	char sbuf[STRERR_BUFSIZE];
> |  
> |  	trace->live = true;
> | @@ -2171,8 +2172,12 @@ next_event:
> |  	if (trace->nr_events == before) {
> |  		int timeout = done ? 100 : -1;
> |  
> | -		if (perf_evlist__poll(evlist, timeout) > 0)
> | +		if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
> | +			if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
> | +				draining = true;
> | +
> 
> If an fd gets into the POLLHUP state, perf_evlist__filter_pollfd()
> releases ("puts") the mmap for that fd. We are seeing that quite often
> _all_ fds are in the POLLHUP state, and hence every mmap->base is set
> to NULL.
> 
> 
> |  			goto again;
> 
> Now with this goto, we go back and call perf_evlist__mmap_read() which
> tries to access the freed mmaps.
> 
> Should there be another check before reading the mmap again?

Possibly, I'm checking. But a similar algorithm should be in place for
'record': do you see any problems there, i.e. with 'perf record sleep
1'?

- Arnaldo
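
For readers following the thread, the filtering step discussed above can
be sketched in isolation. This is only an illustration built on plain
poll(2) structures: filter_pollfds() is a hypothetical helper, not
perf_evlist__filter_pollfd() itself, and it does not release any mmap.
It compacts the array and returns how many descriptors are still worth
polling, so a return value of 0 corresponds to the case where the patch
sets 'draining'.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/*
 * Hypothetical helper (not perf's implementation): compact 'fds' in
 * place, dropping every entry whose revents has any bit of 'mask' set.
 * Returns the number of entries that remain.
 */
static size_t filter_pollfds(struct pollfd *fds, size_t nr, short mask)
{
	size_t i, kept = 0;

	assert(fds != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (fds[i].revents & mask)
			continue;	/* hung up or errored: stop polling it */
		fds[kept++] = fds[i];	/* still live: keep it, preserving order */
	}
	return kept;
}
```

A caller would run this after poll() returns and stop polling once the
result reaches zero.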
 
> I must add that I don't get the SIGSEGV on recent perf-core, and the
> system where we get the crash first runs into the following errors,
> which we are still looking into (maybe related to the "ppc64le"
> architecture).
> 
> 	Problems reading syscall 45 information
> 	Problems reading syscall 5 information
> 	Problems reading syscall 5 information
> 	Problems reading syscall 108 information
> 	Problems reading syscall 108 information
> 	Problems reading syscall 90 information
> 	Problems reading syscall 90 information
> 	Problems reading syscall 6 information
> 
> Unlike the SIGSEGV, these errors always occur.
> 
> | +		}
> |  	} else {
> |  		goto again;
> |  	}
> | -- 
> | 1.8.3.1
> | 
> 
> The following hack seems to fix the SIGSEGV, but then we completely
> ignore the 'draining' flag.
> 
> 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index fb12645..ac25e16 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -2173,8 +2173,10 @@ next_event:
>                 int timeout = done ? 100 : -1;
>  
>                 if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
> -                       if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
> +                       if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0) {
>                                 draining = true;
> +                               goto out_disable;
> +                       }
>  
>                         goto again;
>                 }
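
As an alternative to leaving the loop early, the "another check before
reading the mmap" asked about earlier in this thread might look roughly
like the sketch below. Everything here is an assumption for
illustration: 'struct ring', ring_read() and drain() are hypothetical
stand-ins, not perf's data structures or perf_evlist__mmap_read(). The
point is only that a reader which tests for a released buffer can keep
draining the remaining ones without dereferencing a NULL base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-fd ring buffer; 'base' is NULL once it was released. */
struct ring {
	const char *base;
	size_t head;	/* one past the last written entry */
	size_t tail;	/* next entry to read */
};

/* Return the next unread entry, or NULL if the ring is released or empty. */
static const char *ring_read(struct ring *r)
{
	if (r->base == NULL)	/* the extra check: buffer already released */
		return NULL;
	if (r->tail == r->head)	/* nothing left to consume */
		return NULL;
	return &r->base[r->tail++];
}

/* Consume whatever is left in every ring; returns the number of entries read. */
static size_t drain(struct ring *rings, size_t nr)
{
	size_t i, consumed = 0;

	assert(rings != NULL || nr == 0);

	for (i = 0; i < nr; i++)
		while (ring_read(&rings[i]) != NULL)
			consumed++;
	return consumed;
}
```

With a guard of this kind the 'draining' flag keeps its meaning: polling
stops, but buffers that are still mapped are read until they are empty.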