linux-kernel - Re: [PATCH tip/core/rcu 02/15] rcu: Use timer as backstop for NOCB deferred wakeups


Message-ID: <20170726171801.5da044c3@vmware.local.home>
Date:   Wed, 26 Jul 2017 17:18:01 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org,
        jiangshanlai@...il.com, dipankar@...ibm.com,
        akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
        josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
        dhowells@...hat.com, edumazet@...gle.com, fweisbec@...il.com,
        oleg@...hat.com
Subject: Re: [PATCH tip/core/rcu 02/15] rcu: Use timer as backstop for NOCB
 deferred wakeups

On Tue, 25 Jul 2017 17:05:40 -0700
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:

> On Tue, Jul 25, 2017 at 06:17:10PM -0400, Steven Rostedt wrote:
> > On Tue, 25 Jul 2017 12:18:14 -0700
> > "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> >   
> > > On Tue, Jul 25, 2017 at 02:12:20PM -0400, Steven Rostedt wrote:  
> > > > On Mon, 24 Jul 2017 14:44:31 -0700
> > > > "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> > > >     
> > > > > The handling of RCU's no-CBs CPUs has a maintenance headache, namely
> > > > > that if call_rcu() is invoked with interrupts disabled, the rcuo kthread
> > > > > wakeup must be deferred to a point where we can be sure that scheduler
> > > > > locks are not held.  Of course, there are a lot of code paths leading
> > > > > from an interrupts-disabled invocation of call_rcu(), and missing any
> > > > > one of these can result in excessive callback-invocation latency, and
> > > > > potentially even system hangs.    
> > > > 
> > > > What about using irq_work? That's what perf and ftrace use for such a
> > > > case.    
> > > 
> > > I hadn't looked at irq_work before, thank you for the pointer!
> > > 
> > > I nevertheless believe that timers work better in this particular case
> > > because they can be cancelled (which appears to be the common case), they  
> > 
> > Is the common case here that it doesn't trigger? That is, the
> > del_timer() will be called?  
> 
> If you have lots of call_rcu() invocations, many of them will be invoked
> with interrupts enabled, and a later one with interrupts enabled will
> take care of things for the earlier ones.  So there can be workloads
> where this is the case.

Note, only the first irq_work queued will take action. The other
callers will see that an irq_work is pending and will not queue another.

> 
> > > normally are not at all time-critical, and because running in softirq
> > > is just fine -- no need to run out of the scheduling-clock interrupt.  
> > 
> > irq_work doesn't always use the scheduling clock. IIRC, it will simply
> > trigger an interrupt (if the arch supports it), and the work will be
> > done when interrupts are enabled (the interrupt that will do the work
> > will trigger).
> 
> Ah, OK, so scheduling clock is just the backstop.  Still, softirq
> is a bit nicer to manage than hardirq.

Still requires a hard interrupt (timer) (thinking of NOHZ FULL where
this does matter).

> 
> > > Seem reasonable?  
> > 
> > Don't know. With irq_work, you just call it and forget about it. No
> > need to mod or del timers.  
> 
> But I could have a series of call_rcu() invocations with interrupts
> disabled, so I would need to interact somehow with the irq_work handler.
> Either that or dynamically allocate the needed data structure.
> 
> Or am I missing something here?

You treat it just like you are with the timer code. You have an irq_work
struct attached to your rdp descriptor. And call irq_work_queue() when
interrupts are disabled. If one hasn't already been queued it will
queue one. Then the irq_work handler will look at the rdp attached to
the irq_work (container_of()), and then wake the associated thread.
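
To make the shape of that concrete, here is a rough userspace sketch of
the pattern, not kernel code. Everything prefixed mock_ is a hypothetical
stand-in: mock_irq_work_queue() imitates the "queue only if not already
pending" behavior of the kernel's irq_work_queue(), mock_irq_work_fire()
plays the part of the interrupt that runs once interrupts are enabled
again, and the wakeups counter stands in for waking the rcuo kthread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct irq_work. */
struct mock_irq_work {
	bool pending;
	void (*func)(struct mock_irq_work *work);
};

/* Queue the work unless it is already pending; true if newly queued. */
static bool mock_irq_work_queue(struct mock_irq_work *work)
{
	assert(work != NULL);
	if (work->pending)
		return false;	/* an earlier caller already queued it */
	work->pending = true;
	return true;
}

/* Stand-in for the interrupt that fires once irqs are enabled again. */
static void mock_irq_work_fire(struct mock_irq_work *work)
{
	if (!work->pending)
		return;
	work->pending = false;	/* clear first so the handler may requeue */
	work->func(work);
}

/* Hypothetical per-CPU descriptor with the work item embedded in it. */
struct mock_rcu_data {
	int cpu;
	int wakeups;		/* stands in for waking the rcuo kthread */
	struct mock_irq_work nocb_wake;
};

/* Handler: recover the enclosing descriptor, then do the wakeup. */
static void mock_nocb_wake_handler(struct mock_irq_work *work)
{
	struct mock_rcu_data *rdp =
		container_of(work, struct mock_rcu_data, nocb_wake);

	rdp->wakeups++;
}
```

A series of call_rcu() invocations with interrupts disabled would each
call the queue function on the same embedded work item; only the first
one queues it, and the handler runs once for the whole batch. No
allocation is needed because the work item lives inside the descriptor.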

It is much lighter weight than a timer setup.

-- Steve