linux-kernel - Re: ftrace performance (sched events): cyclictest shows 25% more latency

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180709113257.79152dd0@gandalf.local.home>
Date:   Mon, 9 Jul 2018 11:32:57 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Claudio <claudio.fontana@...wa.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org
Subject: Re: ftrace performance (sched events): cyclictest shows 25% more
 latency

On Fri, 6 Jul 2018 08:22:01 +0200
Claudio <claudio.fontana@...wa.com> wrote:

> Hello all,
> 
> I have been experimenting with the idea of leaving ftrace enabled, with sched events,
> on production systems.
> 
> The main concern that I am having at the moment is about the impact on the system.
> Enabling the sched events that I currently need for the tracing application
> seems to slow down context-switches considerably, and make the system less responsive.
> 
> I have tested with cyclictest on the mainline kernel, and noticed an increase of min, avg latencies of around 25%.
> 
> Is this expected?
> 
> Some initial investigation into ftrace seems to point at the reservation and commit of the events into the ring buffer
> as the highest sources of overhead, while event parameters copying, including COMM, does not seem to have any noticeable effect
> relative to those costs.
> 
> I have been running 20 times the following test, and thrown away the first results:
> 
> $ sudo ./cyclictest --smp -p95 -m -s -N -l 100000 -q

OK, I just noticed that you are using -N which means all numbers are in
nanoseconds.

> 
> $ uname -a
> Linux claudio-HP-ProBook-470-G5 4.18.0-rc3+ #3 SMP Tue Jul 3 15:50:30 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> For brevity, this is a comparison of one test's results. All other test results show the same ~25% increase.
> 
> On the left side, the run without ftrace sched events, on the right side with ftrace sched events enabled.
> 
> CPU    Count      Min       Act        Avg        Max        Count      Min-ftrace Act-ftrace Avg-ftrace Max-ftrace
> 0     100000      2339       2936       2841     139478     100000       2900       3182       3566      93056
> 1      66742      2365       3386       2874      93639      66750       2959       3786       3646     154074
> 2      50080      2376       3058       2910     196221      50097       2997       4209       3655      18707
> 3      40076      2394       3461       2931      17914      40091       3006       4417       3750      17159
> 4      33404      2371       3612       2834      15336      33419       2997       3836       3594      23172
> 5      28635      2387       3313       2885      25863      28649       2995       3795       3647       9956
> 6      25058      2384       3428       2968      12162      25071       3051       4366       3719      18151
> 7      22275      2381       2859       2982      10706      22287       3046       5078       3825      10781
> 
> I would be thankful for any advice or comments on this,
> especially with the goal in mind to lower as much as possible the runtime impact on the system.

Thus, the tracing is causing the wakeup time to be an average of 0.8us
longer.

Yes that is expected.

-- Steve