linux-kernel - Re: [PATCH] perf sched: kill time stamp discrepancy between script and latency

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Sat, 1 Oct 2016 14:15:31 +0900
From:   Namhyung Kim <namhyung@...nel.org>
To:     Joonwoo Park <joonwoop@...eaurora.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf sched: kill time stamp discrepancy between script
 and latency

Hi Joonwoo,

On Wed, Sep 28, 2016 at 07:25:26PM -0700, Joonwoo Park wrote:
> Perf sched latency is handy to find out the maximum sched latency and
> the time stamp of the event.  After running sched latency, if a found
> latency looks suspicious it's quite reasonable to run perf script
> subsequently and search with the time stamp given by perf sched latency
> to continue further debugging.  However, at present, it's possible the
> time stamp given by perf sched latency cannot be found in the trace
> output by perf script because perf sched latency converts the event
> time from ns to ms as double float and prints it with printf which
> does banker's rounding as opposed to perf script doesn't.
> 
>   For example:
> 
>    0x750ff0 [0x80]: event: 9
>    <snip>
>    2 1858303049520 0x750ff0 [0x80]: PERF_RECORD_SAMPLE(IP, 0x1): 15281/15281: 0xffffffff8162a63a period: 1 addr: 0
>     ... thread: hackbench:15281
>    <snip>
> 
>     $ perf sched -i perf.data latency | grep hackbench
>       hackbench:(401)       +   3539.283 ms |    23347 | avg:    7.286 ms | max:  829.998 ms | max at:   1858.303050 s
> 
>     $ perf script -i perf.data | grep "1858\.303050"
> 
>     $ perf script -i perf.data | grep "1858\.303049"
>       hackbench 15281 [002]  1858.303049:       sched:sched_switch: prev_comm=hackbench prev_pid=15281 prev_prio=120 prev_state=D ==> next_comm=hackbench next_pid=15603 next_prio=120
> 
> Fix perf latency to print out time stamp without rounding to avoid such
> discrepancy.
> 
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Arnaldo Carvalho de Melo <acme@...nel.org>
> Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
> Cc: Steven Rostedt <rostedt@...dmis.org>
> Cc: Namhyung Kim <namhyung@...nel.org>
> Cc: linux-kernel@...r.kernel.org
> Signed-off-by: Joonwoo Park <joonwoop@...eaurora.org>
> ---
> 
> I was tempted to get rid of all u64 to double casting in the function
> output_lat_thread but didn't because there is no data loss as of
> today.  Double float gives at least 15 significant decimal digits
> precision while the function requires only 14 significant digits precision.
> 
> $ python -c "print(len(str(int(0xffffffffffffffff / 1e6))))"
> 14
> 
>  tools/lib/traceevent/event-parse.h |  1 +
>  tools/perf/builtin-sched.c         | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
> index 9ffde37..f42703c 100644
> --- a/tools/lib/traceevent/event-parse.h
> +++ b/tools/lib/traceevent/event-parse.h
> @@ -174,6 +174,7 @@ struct pevent_plugin_option {
>  
>  #define NSECS_PER_SEC		1000000000ULL
>  #define NSECS_PER_USEC		1000ULL
> +#define MSECS_PER_SEC		1000ULL
>  
>  enum format_flags {
>  	FIELD_IS_ARRAY		= 1,
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index afa0576..e5cf51a 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1190,6 +1190,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>  	int i;
>  	int ret;
>  	u64 avg;
> +	u64 max_lat_at_sec, max_lat_at_msec;

Isn't it usec rathen than msec? :)

>  
>  	if (!work_list->nb_atoms)
>  		return;
> @@ -1212,11 +1213,18 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>  
>  	avg = work_list->total_lat / work_list->nb_atoms;
>  
> -	printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
> +	/*
> +	 * Avoid round up with printf to prevent event time discrepency
> +	 * between sched script and latency.
> +	 */
> +	max_lat_at_sec = work_list->max_lat_at / NSECS_PER_SEC;
> +	max_lat_at_msec = (work_list->max_lat_at -
> +			   max_lat_at_sec * NSECS_PER_SEC) / MSECS_PER_SEC;
> +	printf("+%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %6lu.%06lu s\n",

Maybe you'd better to be in sync with the script code:

	if (PRINT_FIELD(TIME)) {
		nsecs = sample->time;
		secs = nsecs / NSECS_PER_SEC;
		nsecs -= secs * NSECS_PER_SEC;
		usecs = nsecs / NSECS_PER_USEC;
		if (nanosecs)
			printf("%5lu.%09llu: ", secs, nsecs);
		else
			printf("%5lu.%06lu: ", secs, usecs);
	}

Thanks,
Namhyung


>  	      (double)work_list->total_runtime / 1e6,
>  		 work_list->nb_atoms, (double)avg / 1e6,
>  		 (double)work_list->max_lat / 1e6,
> -		 (double)work_list->max_lat_at / 1e9);
> +		 max_lat_at_sec, max_lat_at_msec);
>  }
>  
>  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>