linux-kernel - Re: Use case for TASKS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170519062331.52dhungzvcsdxdgo@gmail.com>
Date:   Fri, 19 May 2017 08:23:31 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     rostedt@...dmis.org, linux-kernel@...r.kernel.org,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Use case for TASKS_RCU


* Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:

> On Tue, May 16, 2017 at 08:22:33AM +0200, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> > 
> > > Hello!
> > > 
> > > The question of the use case for TASKS_RCU came up, and here is my
> > > understanding.  Steve will not be shy about correcting any misconceptions
> > > I might have.  ;-)
> > > 
> > > The use case is to support freeing of trampolines used in tracing/probing
> > > in CONFIG_PREEMPT=y kernels.  It is necessary to wait until any task
> > > executing in the trampoline in question has left it, taking into account
> > > that the trampoline's code might be interrupted and preempted.  However,
> > > the code in the trampolines is guaranteed never to context switch.
> > > 
> > > Note that in CONFIG_PREEMPT=n kernels, synchronize_sched() suffices.
> > > It is therefore tempting to think in terms of disabling preemption across
> > > the trampolines, but there is apparently not enough room to accommodate
> > > the needed preempt_disable() and preempt_enable() in the code invoking
> > > the trampoline, and putting the preempt_disable() and preempt_enable()
> > > in the trampoline itself fails because of the possibility of preemption
> > > just before the preempt_disable() and just after the preempt_enable().
> > > Similar reasoning rules out use of rcu_read_lock() and rcu_read_unlock().
> > 
> > So how was this solved before TASKS_RCU? Also, nothing uses call_rcu_tasks() at 
> > the moment, so it's hard for me to review its users. What am I missing?
> 
> Before TASKS_RCU, the trampolines were just leaked when CONFIG_PREEMPT=y.
> 
> Current mainline kernel/trace/ftrace.c uses synchronize_rcu_tasks().
> So yes, currently one user.

So why not schedule a worklet on every CPU to drive the trampoline freeing? To 
guarantee that nothing was preempted it could run at SCHED_IDLE and could observe 
nr_running from the worklet and use a short timeout loop. Batching and hysteresis 
would ensure that this is only running rarely in practice.

It doesn't have to be fast or particularly elegant, but it could use existing 
kernel facilites just fine: it's a corner case cost and quirk of our live kernel 
text modifying trampoline code and our current CONFIG_PREEMPT=y model.

I.e. don't make it an RCU facility that complicates not just the RCU code but has 
various costs in generic code as well:

kernel/exit.c:  TASKS_RCU(int tasks_rcu_i);
kernel/exit.c:  TASKS_RCU(preempt_disable());
kernel/exit.c:  TASKS_RCU(tasks_rcu_i = __srcu_read_lock(&tasks_rcu_exit_srcu));
kernel/exit.c:  TASKS_RCU(preempt_enable());
kernel/exit.c:  TASKS_RCU(__srcu_read_unlock(&tasks_rcu_exit_srcu, tasks_rcu_i));

I.e. I question that this should be a generic RCU facility.

Thanks,

	Ingo