Message-ID: <504085e9-bf91-4948-a158-abae5dcb276a@efficios.com>
Date: Fri, 19 Jan 2024 09:40:27 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Steven Rostedt <rostedt@...dmis.org>, LKML
 <linux-kernel@...r.kernel.org>,
 Linux Trace Kernel <linux-trace-kernel@...r.kernel.org>
Cc: Masami Hiramatsu <mhiramat@...nel.org>,
 Mark Rutland <mark.rutland@....com>
Subject: Re: [PATCH] ring-buffer: Simplify reservation with try_cmpxchg() loop

On 2024-01-18 18:12, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@...dmis.org>
> 
> Instead of using local_add_return() to reserve the ring buffer data,
> Mathieu Desnoyers suggested using local_cmpxchg(). This would simplify the
> reservation with the time keeping code.

I admire the effort of trying to simplify the Ftrace ring buffer by bringing
over ideas that worked well for LTTng. :-) As a reviewer of the tracing
subsystem, I certainly welcome the simplifications.

> Although, it does not get rid of the double time stamps (before_stamp and
> write_stamp), using cmpxchg() does get rid of the more complex case when
> an interrupting event occurs between getting the timestamps and reserving
> the data, as when that happens, it just tries again instead of dealing
> with it.

I understand that you need the before/after stamps and their associated
complexity because the Ftrace ring buffer ABI encodes each event timestamp
as a delta from the previous event within the buffer, as a means of
compressing the timestamp fields. If the delta cannot be represented in a
given number of bits, then it inserts a 64-bit timestamp (not sure if that
one is absolute or a delta from the previous event).

Encoding timestamps as deltas between events introduces a strong
inter-dependency between consecutive (nested) events, and is the reason
why you are stuck with all this timestamp before/after complexity.
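
To make the dependency concrete, here is a minimal sketch of delta encoding
as described above. It is not the Ftrace implementation: the names
(delta_header, delta_encode, delta_decode) are hypothetical, and the 27-bit
delta width is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DELTA_BITS 27   /* assumed width of the compact delta field */
#define DELTA_MAX  ((UINT64_C(1) << DELTA_BITS) - 1)

static_assert(DELTA_BITS > 0 && DELTA_BITS < 64, "delta must fit in 64 bits");

struct delta_header {
	bool     extended;  /* true: delta did not fit, full-width field needed */
	uint64_t value;     /* delta from the previous event */
};

/* Writer side: the previous event's timestamp is a required input. */
static struct delta_header delta_encode(uint64_t prev_ts, uint64_t now)
{
	uint64_t delta = now - prev_ts;
	struct delta_header h = {
		.extended = delta > DELTA_MAX,
		.value = delta,
	};
	return h;
}

/* Reader side: a timestamp is only known relative to the previous one. */
static uint64_t delta_decode(uint64_t prev_ts, struct delta_header h)
{
	return prev_ts + h.value;
}
```

Because delta_encode() takes prev_ts as an input, a nested event that lands
between reading prev_ts and reserving space invalidates the computed delta,
which is the race the before/after stamps have to detect.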

The Common Trace Format specifies (and LTTng implements) a different way
to achieve the same ring buffer space savings as timestamp deltas while
keeping the timestamps semantically absolute from a given reference,
hence without all the before/after timestamp complexity. You can see the
clock value decoding procedure in the CTF2 SPEC RC9 [1] document. The basic
idea on the producer side is to record the low-order bits of the current
timestamp in the event header (truncating the higher-order bits), and to
fall back on a full 64-bit value if the low-order bits have overflowed more
than once since the previous timestamp, or if it is impossible to figure
out precisely the timestamp of the previous event due to a race. This achieves
the same space savings as delta timestamp encoding without introducing the
strong event inter-dependency.
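
A minimal sketch of the scheme described above, under stated assumptions:
the function names (ts_truncate, ts_needs_full, ts_decode) and the 27-bit
field width are hypothetical, and this is neither the LTTng code nor a
verbatim copy of the procedure in [1]. The fallback check used here is the
conservative one: the compact form is used only when less than one full
period of the low-order bits has elapsed since the previous timestamp.

```c
#include <assert.h>
#include <stdint.h>

#define TS_BITS 27   /* assumed number of low-order bits kept per event */
#define TS_MASK ((UINT64_C(1) << TS_BITS) - 1)

static_assert(TS_BITS > 0 && TS_BITS < 64, "compact field must fit in 64 bits");

/* Producer: keep only the low-order bits of the absolute timestamp. */
static uint64_t ts_truncate(uint64_t now)
{
	return now & TS_MASK;
}

/*
 * Producer: non-zero when a full 64-bit timestamp must be written instead.
 * A writer that cannot determine prev reliably (race) would also take the
 * full-timestamp path; that case is not modelled here.
 */
static int ts_needs_full(uint64_t prev, uint64_t now)
{
	return (now - prev) > TS_MASK;
}

/* Consumer: rebuild the absolute value from the last known clock value. */
static uint64_t ts_decode(uint64_t prev, uint64_t low)
{
	uint64_t high = prev & ~TS_MASK;

	if (low < (prev & TS_MASK))
		high += TS_MASK + 1;   /* low-order bits wrapped exactly once */
	return high | low;
}
```

For example, with prev = (5 << 27) + 100 and now = prev + 50, the producer
stores ts_truncate(now) and the consumer recovers now with
ts_decode(prev, ...). The stored value is a function of the current clock
only, so a nested event does not invalidate it; the previous timestamp is
consulted just to choose between the compact and full forms.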

The fact that Ftrace exposes this ring buffer binary layout as a user-space
ABI makes it tricky to move to the Common Trace Format timestamp encoding.
There are clearly huge simplifications that could be made by moving to this
scheme though. Is there any way to introduce a different timestamp encoding
scheme as an extension to the Ftrace ring buffer ABI? This would allow us to
introduce this simpler scheme and gradually phase out the more complex delta
encoding when no users are left.

Thoughts?

Thanks,

Mathieu

[1] https://diamon.org/ctf/files/CTF2-SPECRC-9.0.html#clk-val-update

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

