linux-kernel - Re: [RFC][PATCH] ring-buffer: Have nested events still record running time stamp


Message-ID: <20200625155850.13be7bfa@oasis.local.home>
Date:   Thu, 25 Jun 2020 15:58:50 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Yordan Karadzhov <y.karadz@...il.com>,
        Tzvetomir Stoyanov <tz.stoyanov@...il.com>,
        Tom Zanussi <zanussi@...nel.org>,
        Jason Behmer <jbehmer@...gle.com>,
        Julia Lawall <julia.lawall@...ia.fr>,
        Clark Williams <williams@...hat.com>,
        bristot <bristot@...hat.com>, Daniel Wagner <wagi@...om.org>,
        Darren Hart <dvhart@...are.com>,
        Jonathan Corbet <corbet@....net>,
        "Suresh E. Warrier" <warrier@...ux.vnet.ibm.com>
Subject: Re: [RFC][PATCH] ring-buffer: Have nested events still record
 running time stamp

On Thu, 25 Jun 2020 15:35:02 -0400 (EDT)
Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:

> > 
> > Well, write_stamp is updated via local64, which I believe handles this
> > for us. I probably should make before_stamp handle it as well.  
> 
> By looking at local64 headers, it appears that 32-bit architectures rely on atomic64,
> which on x86 is implemented with LOCK; cmpxchg8b for 586+ (which is AFAIK
> painfully slow) and with cli/sti for 386/486 (which is not nmi-safe).
> 
> For all other 32-bit architectures, the generic atomic64.h implements 64-bit
> atomics using spinlocks with irqs off, which seems to also bring considerable
> overhead, in addition to being non-reentrant with respect to NMI-like interrupts,
> e.g. FIQ on ARM32.
> 
> That seems at odds with the performance constraints of ftrace's ring buffer.
> 
> Those performance and reentrancy concerns are why I always stick to local_t
> (long), and never use a full 64-bit type for anything that has to
> do with concurrent store/load between execution contexts in lttng.

If this is an issue, I'm sure I can make my own wrappers for
"time_local()", and implement something similar to what you probably
do, because we only need to worry about the lower 32 bits wrapping,
which only happens about every 4 seconds. But that is an implementation
detail; it doesn't affect the overall design correctness.

But it is something I should definitely look into.
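
To illustrate the wrapping point: assuming the clock counts nanoseconds,
the lower 32 bits wrap every 2^32 ns, which is roughly 4.29 seconds. Below
is a minimal userspace sketch of detecting that wrap and of rebuilding a
64-bit stamp from a word-sized lower half. The helper names
low_word_wrapped() and extend_low32() are hypothetical; they are not kernel
or LTTng APIs, and this is not necessarily how either project implements it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: timestamps are in nanoseconds, so the low word wraps
 * every 2^32 ns (about 4.29 s). */

/* Hypothetical helper: true if going from old_ts to new_ts changes the
 * upper 32 bits, i.e. the update cannot be done by storing only the
 * lower word. */
static bool low_word_wrapped(uint64_t old_ts, uint64_t new_ts)
{
	return (old_ts >> 32) != (new_ts >> 32);
}

/* Hypothetical helper: rebuild a full 64-bit stamp from an earlier full
 * stamp and the low 32 bits of a later one. Only valid if fewer than
 * 2^32 ns elapsed between the two. */
static uint64_t extend_low32(uint64_t last_full, uint32_t low)
{
	uint64_t ts = (last_full & ~0xffffffffULL) | low;

	/* A smaller low word means it wrapped once since last_full. */
	if (low < (uint32_t)last_full)
		ts += 1ULL << 32;
	return ts;
}
```

The common case then only touches a word-sized value, and the upper half
needs attention only when low_word_wrapped() would report true.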

> 
> > 
> >   
> >>   
> >> >	 * a full time stamp (this can turn into a time extend which is
> >> >	 * just an extended time delta but fill up the extra space).
> >> >	 */
> >> >	if (after != before)
> >> >		abs = true;
> >> > 
> >> >	ts = clock();
> >> > 
> >> >	/* Now update the before_stamp (everyone does this!) */
> >> > [B]	WRITE_ONCE(before_stamp, ts);
> >> > 
> >> >	/* Read the current next_write and update it to what we want write
> >> >	 * to be after we reserve space. */
> >> > 	next = READ_ONCE(next_write);
> >> >	WRITE_ONCE(next_write, w + len);
> >> > 
> >> >	/* Now reserve space on the buffer */
> >> > [C]	write = local_add_return(len, write_tail);  
> >> 
> >> So the reservation is not "just" an add instruction, it's actually an
> >> xadd on x86. Is that really faster than a cmpxchg ?  
> > 
> > I believe the answer is still yes. But I can run some benchmarks to
> > make sure.  
> 
> This would be interesting to see, because if xadd and cmpxchg have
> similar overhead, then going for a cmpxchg-loop for the space
> reservation could vastly decrease the overall complexity of this
> timestamp+space reservation algorithm.

It would most likely cause userspace breakage, and that would be a show
stopper.

But still good to see.
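
For reference, the two reservation styles being compared can be sketched in
userspace with C11 atomics. This is only an illustration under stated
assumptions: struct demo_buf, reserve_xadd() and reserve_cmpxchg() are
hypothetical names, not the ring buffer's code, and atomic_fetch_add()
returns the old value whereas the kernel's local_add_return() returns the
new one.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer page with a write tail. */
struct demo_buf {
	atomic_size_t tail;	/* next free offset */
	size_t size;		/* capacity in bytes */
};

/* xadd style: one unconditional atomic add. The tail always moves, so
 * the caller must check afterwards whether the reservation overshot
 * the buffer. Returns the start offset of the reserved region. */
static size_t reserve_xadd(struct demo_buf *b, size_t len)
{
	return atomic_fetch_add(&b->tail, len);
}

/* cmpxchg-loop style: the new tail is computed first and only
 * committed if nothing else moved the tail in between. Returns the
 * start offset, or (size_t)-1 if len does not fit. */
static size_t reserve_cmpxchg(struct demo_buf *b, size_t len)
{
	size_t old = atomic_load(&b->tail);

	do {
		if (old > b->size || len > b->size - old)
			return (size_t)-1;
		/* On failure, old is reloaded with the current tail. */
	} while (!atomic_compare_exchange_weak(&b->tail, &old, old + len));

	return old;
}
```

The cmpxchg loop can reject a reservation before any shared state changes,
which is where the possible simplification comes from; the xadd version
never retries but has to handle the overshoot after the fact.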

Thanks for the review.

-- Steve