linux-kernel - Overhead of ring buffer in Ftrace


Message-ID: <CAJrp0nBcm-++zejsjuM-q4=0eKzXOOUMUNjnTzfmt6FgDiiy1Q@mail.gmail.com>
Date:   Fri, 2 Aug 2019 01:41:55 -0400
From:   Fang Zhou <timchou.hit@...il.com>
To:     linux-kernel@...r.kernel.org
Subject: Overhead of ring buffer in Ftrace

Hi all,

I'm currently using Ftrace with tracepoints to trace several events in
the kernel, but I found the tracing overhead to be somewhat high.

I found that the major overhead comes from
"local_dec(&cpu_buffer->committing);" in the rb_end_commit() function.
As far as I can tell, local_dec() invokes atomic_long_dec(), which
ultimately performs a LOCK_PREFIX plus "DECQ" on this variable.

I'm a little confused. cpu_buffer is a per-cpu buffer, so I cannot come
up with a scenario in which two cores run INC or DEC on the same
per-cpu value at the same time.
So why do we use such a heavy operation here? Can we simply use "DECQ"
without LOCK_PREFIX?
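
To make the comparison concrete, here is a minimal user-space sketch of
the two variants, not kernel code. The counters and the helpers
dec_locked() and dec_plain() are hypothetical names made up for this
illustration, and C11 atomics stand in for the kernel's local_t only as
an assumption.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for cpu_buffer->committing (not the kernel type). */
static atomic_long committing_atomic = 1;
static long committing_plain = 1;

/*
 * Atomic read-modify-write. On x86-64, compilers typically lower this to
 * a LOCK-prefixed instruction, which is the costly variant in question.
 */
static long dec_locked(void)
{
    return atomic_fetch_sub_explicit(&committing_atomic, 1,
                                     memory_order_relaxed) - 1;
}

/*
 * Plain decrement with no LOCK prefix. This is only safe if no other CPU
 * ever modifies the variable. Note that C does not guarantee that this
 * compiles to a single "decq" on memory; it may become load/sub/store.
 */
static long dec_plain(void)
{
    return --committing_plain;
}
```

Both helpers return the new counter value, so starting from 1 each one
returns 0 on its first call. Comparing the generated assembly (for
example with "gcc -O2 -S") should show whether the LOCK prefix is
present in each case.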

Thanks,
Tim