linux-kernel - Re: Use case for TASKS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20170519133550.GD3956@linux.vnet.ibm.com>
Date:   Fri, 19 May 2017 06:35:50 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     rostedt@...dmis.org, linux-kernel@...r.kernel.org,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Use case for TASKS_RCU

On Fri, May 19, 2017 at 08:23:31AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> 
> > On Tue, May 16, 2017 at 08:22:33AM +0200, Ingo Molnar wrote:
> > > 
> > > * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> > > 
> > > > Hello!
> > > > 
> > > > The question of the use case for TASKS_RCU came up, and here is my
> > > > understanding.  Steve will not be shy about correcting any misconceptions
> > > > I might have.  ;-)
> > > > 
> > > > The use case is to support freeing of trampolines used in tracing/probing
> > > > in CONFIG_PREEMPT=y kernels.  It is necessary to wait until any task
> > > > executing in the trampoline in question has left it, taking into account
> > > > that the trampoline's code might be interrupted and preempted.  However,
> > > > the code in the trampolines is guaranteed never to context switch.
> > > > 
> > > > Note that in CONFIG_PREEMPT=n kernels, synchronize_sched() suffices.
> > > > It is therefore tempting to think in terms of disabling preemption across
> > > > the trampolines, but there is apparently not enough room to accommodate
> > > > the needed preempt_disable() and preempt_enable() in the code invoking
> > > > the trampoline, and putting the preempt_disable() and preempt_enable()
> > > > in the trampoline itself fails because of the possibility of preemption
> > > > just before the preempt_disable() and just after the preempt_enable().
> > > > Similar reasoning rules out use of rcu_read_lock() and rcu_read_unlock().
> > > 
> > > So how was this solved before TASKS_RCU? Also, nothing uses call_rcu_tasks() at 
> > > the moment, so it's hard for me to review its users. What am I missing?
> > 
> > Before TASKS_RCU, the trampolines were just leaked when CONFIG_PREEMPT=y.
> > 
> > Current mainline kernel/trace/ftrace.c uses synchronize_rcu_tasks().
> > So yes, currently one user.
> 
> So why not schedule a worklet on every CPU to drive the trampoline freeing? To 
> guarantee that nothing was preempted it could run at SCHED_IDLE and could observe 
> nr_running from the worklet and use a short timeout loop. Batching and hysteresis 
> would ensure that this is only running rarely in practice.
> 
> It doesn't have to be fast or particularly elegant, but it could use existing 
> kernel facilites just fine: it's a corner case cost and quirk of our live kernel 
> text modifying trampoline code and our current CONFIG_PREEMPT=y model.
> 
> I.e. don't make it an RCU facility that complicates not just the RCU code but has 
> various costs in generic code as well:
> 
> kernel/exit.c:  TASKS_RCU(int tasks_rcu_i);
> kernel/exit.c:  TASKS_RCU(preempt_disable());
> kernel/exit.c:  TASKS_RCU(tasks_rcu_i = __srcu_read_lock(&tasks_rcu_exit_srcu));
> kernel/exit.c:  TASKS_RCU(preempt_enable());
> kernel/exit.c:  TASKS_RCU(__srcu_read_unlock(&tasks_rcu_exit_srcu, tasks_rcu_i));
> 
> I.e. I question that this should be a generic RCU facility.

Simpler would be better!

However, is it really guaranteed that one SCHED_IDLE thread cannot
preempt another?  If not, then the trampoline-freeing SCHED_IDLE thread
might preempt some other SCHED_IDLE thread in the middle of a trampoline.
I am not seeing anything that prevents such preemption, but it is rather
early local time, so I could easily be missing something.

However, if SCHED_IDLE threads cannot preempt other threads, even other
SCHED_IDLE threads, then your approach sounds quite promising to me.

Steve, Peter, thoughts?

							Thanx, Paul