linux-kernel - Re: Tasks RCU vs Preempt RCU

Message-ID: <20180520191846.GA248075@joelaf.mtv.corp.google.com>
Date:   Sun, 20 May 2018 12:18:46 -0700
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        byungchul.park@....com, mathieu.desnoyers@...icios.com,
        Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: Tasks RCU vs Preempt RCU

On Sun, May 20, 2018 at 11:28:43AM -0400, Steven Rostedt wrote:
> 
> [ Steve interrupts his time off ]

Hope you're enjoying your vacation :)

> On Sat, 19 May 2018 17:49:38 -0700
> "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> 
> > I suggested to Steven that the rcu_read_lock() and rcu_read_unlock() might
> > be outside of the trampoline, but this turned out to be infeasible.  Not
> > that I remember why!  ;-)
> 
> Because the trampoline itself is what needs to be freed. The trampoline
> is what mcount/fentry or an optimized kprobe jumps to.
> 
> 
> <func>:
> 	nop
> 
> [ enable function tracing ]
> 
> <func>:
> 	call func_tramp --> set up stack
> 			    call function_tracer()
> 			    pop stack
> 			    ret
> 
> 			    ^^^^^
> 			    This is the trampoline
> 
> There's no way to know when a task will be on the trampoline or not.
> The trampoline is allocated, and we need RCU_tasks to know when we can
> free it. The only way to make a "wrapper" is to modify more of the code
> text to do whatever before calling the trampoline, which is
> impractical.
> 
> The allocated trampolines were added as an optimization, where two
> registered callback functions from ftrace that are attached to two
> different functions don't call the same trampoline which would have to
> do a loop and a hash lookup to know what callback to call per function.
> If a callback is the only one attached to a specific function, then a
> trampoline is allocated and will call that callback directly, keeping
> the overhead down.

Right, I saw your trampoline prototype tree. I understand how it works now,
thanks.

> There is no feasible way to know when a task is on a trampoline
> without adding overhead that negates the speed up we receive by making
> individual trampolines to begin with.

Are you speaking of time overhead or space overhead, or both?

Just thinking out loud here, as some food for thought.

The rcu_read_lock()/rcu_read_unlock() primitives are extremely fast, so I
don't personally think there's a time hit.

Could we get around the trampoline code == data issue by, say, using a
multi-stage trampoline like so?

    call func_tramp --> (static trampoline)       (dynamic trampoline)
                        rcu_read_lock()  -------> set up stack
                                                  call function_tracer()
                                                  pop stack
                        rcu_read_unlock() <------ ret
 
I know there's probably more to it than this, but conceptually at least, it
feels like all the RCU infrastructure is already there to handle preemption
within a trampoline, and it would be cool if the dynamically allocated
trampolines worked as shown above. At least I feel it would be faster than
the pre-trampoline code that did the hash lookups / matching to call the
right function callbacks, and it could help eliminate the need for the
RCU-tasks subsystem and its kthread.
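
To make the ordering concrete, here is a rough userspace sketch of the
two-stage idea. It is not kernel code: every name in it (static_tramp,
dyn_tramp_run, sketch_read_lock and so on) is hypothetical, and a plain
counter stands in for the RCU read-side critical section. It only models
the claim that the dynamic part runs entirely between the lock and the
unlock done by the static part.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
typedef void (*tracer_fn)(unsigned long ip);

/* Models a dynamically allocated trampoline serving one callback. */
struct dyn_tramp {
	tracer_fn callback;
};

static int readers;                  /* models read-side nesting depth */
static int calls_seen;               /* how often the tracer ran */
static int readers_seen_in_callback; /* nesting depth seen by the tracer */

/* A counter standing in for rcu_read_lock()/rcu_read_unlock(). */
void sketch_read_lock(void)   { readers++; }
void sketch_read_unlock(void) { assert(readers > 0); readers--; }

/* Stage 2: the part that would live in freeable memory. */
void dyn_tramp_run(const struct dyn_tramp *t, unsigned long ip)
{
	/* "set up stack" and "pop stack" are omitted in this model */
	t->callback(ip);
}

/* Stage 1: the static part, which is never freed. */
void static_tramp(const struct dyn_tramp *t, unsigned long ip)
{
	sketch_read_lock();
	dyn_tramp_run(t, ip);
	sketch_read_unlock();
}

/* In this model, freeing is allowed only when no reader is inside. */
int can_free_dyn_tramp(void)
{
	return readers == 0;
}

/* Example tracer that records what it observed. */
void sample_tracer(unsigned long ip)
{
	(void)ip;
	calls_seen++;
	readers_seen_in_callback = readers;
}
```

Calling static_tramp() with a struct dyn_tramp whose callback is
sample_tracer runs the tracer once with the counter at 1, and
can_free_dyn_tramp() reports true again only after the call returns. How
the patched call site would actually reach a static stage first is the
open question above and is not modeled here.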

If you still feel it's not worth it, then that's okay too, and clearly
RCU-tasks has benefits such as a simpler trampoline implementation.

thanks!

- Joel