linux-kernel - Re: [PATCH] ring-buffer: Fix slowpath of interrupted event

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-Id: <20231220085206.97ad47a37ddc75e285c689f7@kernel.org>
Date: Wed, 20 Dec 2023 08:52:06 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Linux Trace Kernel
 <linux-trace-kernel@...r.kernel.org>, Mark Rutland <mark.rutland@....com>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Linus Torvalds
 <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] ring-buffer: Fix slowpath of interrupted event

On Tue, 19 Dec 2023 10:10:27 -0500
Steven Rostedt <rostedt@...dmis.org> wrote:

> On Tue, 19 Dec 2023 23:37:10 +0900
> Masami Hiramatsu (Google) <mhiramat@...nel.org> wrote:
> 
> > Yeah the above works, but my question is, do we really need this
> > really slow path? I mean;
> > 
> > > 	if (w == write - event length) {
> > > 		/* Nothing interrupted between A and C */
> > >  /*E*/		write_stamp = ts;
> > > 		delta = ts - after  
> > 
> > 	} else {
> > 		/*
> > 		  Something interrupted between A and C, which should record
> > 		  a new entry before this reserved entry with newer timestamp.
> > 		  we reuse it.
> > 		 */
> > 	 	ts = after = write_stamp;
> > 		delta = 0;
> > 	}
> > 
> > Isn't this enough?
> 
> I really like to avoid: delta = 0 when possible. It's basically what I do
> when I have no other options. Why?
> 
> Because let's just say you are looking at the time of interrupt events. If
> you just trace the entry of the interrupt, and that interrupt interrupted
> an event being written to, we have this:
> 
> Time starts at ts 1000, and we are able to calculate the delta of the
> interrupted event. And the trace will have:
> 
>  1000 - interrupt event
>  2000 - normal context event
>  2100 - next normal context event
> 
> Where we see the delta between the interrupt event and the normal context
> event was 1000. But if we just had it be delta = 0, it would be:
> 
>  1000 - interrupt event
>  1000 - normal context event
>  2100 - next normal context event
> 
> It will look like the time between the interrupt event and the normal
> context event was instant, and the time between the normal context event
> and the next normal context event was 1100 when in reality it was just 100.

Ah, OK. interrupt event can be the beginning of the interrupt handling
and normal context event seems not interrupted from the traced log.
OK, then we have to adjust the ts of normal context event. (interrupted
after reserving the event is OK because user can observe the interrupt
event on the log between normal context event and next one.)

Reveiewed-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>

Thank you!

> 
> The above scenario is rather common. Perhaps it happens 1% of the time. The
> case where we currently have delta = 0 only happens when the same event
> gets interrupted twice. That is, two separate interrupts came in, one
> before it allocated its space on the buffer, and one after it allocated.
> That's a much more race occurrence (0.01% possibly, or less). In fact my
> traces seldom even show it. Most of the time, even when doing function
> tracing, I have a hard time seeing this rare race.
> 
> So, if we can have delta=0 only 0.01% or less of the time, I rather do that
> then have it be delta=0 1% of the time.
> 
> Thanks for the review!
> 
> -- Steve


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>