lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <000001d1e87d$9b28e680$d17ab380$@hitachi.com>
Date:	Thu, 28 Jul 2016 12:10:33 +0900
From:	"Tomoki Sekiyama" <tomoki.sekiyama.qu@...achi.com>
To:	"'David Ahern'" <dsahern@...il.com>, <linux-kernel@...r.kernel.org>
Cc:	<ltc-kernel@...ml.intra.hitachi.co.jp>,
	森谷真寿美 / MORITANI,MASUMI 
	<masumi.moritani.ju@...achi.com>, "Jiri Olsa" <jolsa@...nel.org>,
	"Namhyung Kim" <namhyung@...nel.org>,
	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
	"Masami Hiramatsu" <mhiramat@...nel.org>
Subject: RE: [PATCH] perf sched: fix wrong conversion of task state

On 2016/7/28 1:50, David Ahern wrote:
>>>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
>>>> index 0dfe8df..eb2f7f4 100644
>>>> --- a/tools/perf/builtin-sched.c
>>>> +++ b/tools/perf/builtin-sched.c
>>>> @@ -71,6 +71,7 @@ struct sched_atom {
>>>>  };
>>>>
>>>>  #define TASK_STATE_TO_CHAR_STR "RSDTtZXxKWP"
>>>> +#define TASK_STATE_MASK 0x7ff
>>>
>>> The mask should not be needed and looking at top of tree there are 2 new
>>> states (N and n) that need to be added.
>>
>> I couldn't find the state for 'n'. Where can I find it?
>
> Check Linus' tree -- top of tree; commit 7dc603c9028ea.

Thanks, I was failing to git pull by some network reason..
Will add 'n' too.

>> I have rechecked the code and found that __trace_sched_switch_state()
uses
>> TASK_STATE_MAX for preempted tasks.
>
>since TASK_STATE_MAX is kernel version dependent perf should not rely on
it.

Hmm, that is true (actually 'n' is added recently).
But that means we cannot handle preemption correctly as far as
sched:sched_switch
event uses TASK_STATE_MAX to mark preempted tasks.

Should we stop using TASK_STATE_MAX for preempted tasks in ftrace and
use (1 << 63) or something that doesn't change on kernel version instead?


This causes missing some context switches information by 'perf sched
latency'.
For example, running 2 'yes' commands (inifinitely call write(2)) on the
same
cpu by following steps results in the incomplete result:

$ taskset -c 0 yes > /dev/null &
$ taskset -c 0 yes > /dev/null &
# perf sched record sleep 1
# perf sched latency -p -C 0

like:

----------------------------------------------------------------------
  Task                  |   Runtime ms  | Switches | Average delay ms |
 ----------------------------------------------------------------------
  yes:14187             |    499.705 ms |       17 | avg:   12.981 ms |
  yes:14188             |    500.350 ms |       14 | avg:   12.023 ms |
  gnome-terminal-:12722 |      0.285 ms |        3 | avg:    0.025 ms |
...

where avg delay * switches for yes commands don't cover expected total delay
500 ms although one make another delayed to be executed.
(12.981 ms * 17 = 220.677 ms << 500 ms)

With the patch, perf sched latency shows:
Now 12.838 ms * 39 = 500.682 ms as expected.

----------------------------------------------------------------------
  Task                  |   Runtime ms  | Switches | Average delay ms |
 ----------------------------------------------------------------------
  yes:14187             |    499.705 ms |       39 | avg:   12.838 ms |
  yes:14188             |    500.350 ms |       40 | avg:   12.506 ms |
  gnome-terminal-:12722 |      0.285 ms |        3 | avg:    0.025 ms |
...


Thanks,
Tomoki Sekiyama


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ