[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160318102723.671fd983@gandalf.local.home>
Date: Fri, 18 Mar 2016 10:27:23 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Jiri Olsa <jolsa@...nel.org>
Cc: lkml <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: [PATCH 5/5] ftrace: Update dynamic ftrace calls only if
necessary
Patches 4 and 5 seem agnostic to patches 1-3. I'll pull 4,5 into my
tree and 1-3 can go though tip. I'd like to run theses through my tests.
OK?
-- Steve
On Wed, 16 Mar 2016 15:34:33 +0100
Jiri Olsa <jolsa@...nel.org> wrote:
> Currently dynamic ftrace calls are updated any time
> the ftrace_ops is un/registered. If we do this update
> only when it's needed, we save lot of time for perf
> system wide ftrace function sampling/counting.
>
> The reason is that for system wide sampling/counting,
> perf creates event for each cpu in the system.
>
> Each event then registers separate copy of ftrace_ops,
> which ends up in FTRACE_UPDATE_CALLS updates. On servers
> with many cpus that means serious stall (240 cpus server):
>
> Counting:
> # time ./perf stat -e ftrace:function -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 370,663 ftrace:function
>
> 1.401427505 seconds time elapsed
>
> real 3m51.743s
> user 0m0.023s
> sys 3m48.569s
>
> Sampling:
> # time ./perf record -e ftrace:function -a sleep 1
> [ perf record: Woken up 0 times to write data ]
> Warning:
> Processed 141200 events and lost 5 chunks!
>
> [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]
>
> real 2m31.429s
> user 0m0.213s
> sys 2m29.494s
>
> There's no reason to do the FTRACE_UPDATE_CALLS update
> for each event in perf case, because all the ftrace_ops
> always share the same filter, so the updated calls are
> always the same.
>
> It's required that only first ftrace_ops registration
> does the FTRACE_UPDATE_CALLS update (also sometimes
> the second if the first one used the trampoline), but
> the rest can be only cheaply linked into the ftrace_ops
> list.
>
> Counting:
> # time ./perf stat -e ftrace:function -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 398,571 ftrace:function
>
> 1.377503733 seconds time elapsed
>
> real 0m2.787s
> user 0m0.005s
> sys 0m1.883s
>
> Sampling:
> # time ./perf record -e ftrace:function -a sleep 1
> [ perf record: Woken up 0 times to write data ]
> Warning:
> Processed 261730 events and lost 9 chunks!
>
> [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]
>
> real 1m31.948s
> user 0m0.309s
> sys 1m32.051s
>
> Signed-off-by: Jiri Olsa <jolsa@...nel.org>
> ---
Powered by blists - more mailing lists