[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130727121518.GA1038@krava.brq.redhat.com>
Date:	Sat, 27 Jul 2013 14:15:18 +0200
From:	Jiri Olsa <jolsa@...hat.com>
To:	David Ahern <dsahern@...il.com>
Cc:	acme@...stprotocols.net, linux-kernel@...r.kernel.org,
	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...nel.org>, Mike Galbraith <efault@....de>,
	Namhyung Kim <namhyung@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Stephane Eranian <eranian@...gle.com>
Subject: Re: [PATCH] perf: sample after exit loses thread correlation
On Fri, Jul 26, 2013 at 04:04:14PM -0600, David Ahern wrote:
> Occassionally events (e.g., context-switch, sched tracepoints) are losing
> the conversion of sample data associated with a thread. For example:
> 
> $ perf record -e sched:sched_switch -c 1 -a -- sleep 5
> $ perf script
> <selected events shown>
>     ls 30482 [000] 1379727.583037: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
>     ls 30482 [000] 1379727.586339: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
> :30482 30482 [000] 1379727.589462: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
> 
> The last line lost the conversion from tid to comm. If you look at the events
> (perf script -D) you see why - SAMPLE event is generated after the EXIT:
> 
> 0 1379727589449774 0x1540b0 [0x38]: PERF_RECORD_EXIT(30482:30482):(30482:30482)
> 0 1379727589462497 0x1540e8 [0x80]: PERF_RECORD_SAMPLE(IP, 1): 30482/30482: 0xffffffff816416f1 period: 1 addr: 0
> ... thread: :30482:30482
> 
> When perf processes the EXIT event the thread is moved to the dead_threads
> list. When the SAMPLE event is processed no thread exists for the pid so a new
> one is created by machine__findnew_thread.
> 
> This patch addresses the problem by saving the exit time and checking the
> dead_threads list for the requested tid. If the time passed into
> machine_findnew_thread is non-0 the dead_threads list is walked looking for
> the tid. If the thread struct associated with the tid exited within 1 msec
> of the time passed in the thread is considered a match and returned.
> 
> If samples do not contain timestamps then sample->time will be 0 and the
> dead_threads list will not be checked. -1 can be used to force always checking
> the dead_threads list and returning a match.
> 
> With this patch we get the previous example shows:
> 
> ls 30482 [000] 1379727.583037: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
> ls 30482 [000] 1379727.586339: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
> ls 30482 [000] 1379727.589462: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
> 
> and
> 
> 0 1379727589449774 0x1540b0 [0x38]: PERF_RECORD_EXIT(30482:30482):(30482:30482)
> 0 1379727589462497 0x1540e8 [0x80]: PERF_RECORD_SAMPLE(IP, 1): 30482/30482: 0xffffffff816416f1 period: 1 addr: 0
> ... thread: ls:30482
> 
> v2: Rebased to latest perf/core branch. Changed time comparison to use
>     a macro which explicitly shows the time basis
> 
> Signed-off-by: David Ahern <dsahern@...il.com>
> Cc: Frederic Weisbecker <fweisbec@...il.com>
> Cc: Ingo Molnar <mingo@...nel.org>
> Cc: Jiri Olsa <jolsa@...hat.com>
tested and
Acked-by: Jiri Olsa <jolsa@...hat.com>
jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Powered by blists - more mailing lists
 
