Message-ID: <20200706210645.GJ9247@paulmck-ThinkPad-P72>
Date: Mon, 6 Jul 2020 14:06:45 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: joel@...lfernandes.org, rcu@...r.kernel.org,
linux-kernel@...r.kernel.org, kernel-team@...com, mingo@...nel.org,
jiangshanlai@...il.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com,
fweisbec@...il.com, oleg@...hat.com,
Uladzislau Rezki <urezki@...il.com>
Subject: Re: [PATCH tip/core/rcu 03/17] rcu/tree: Skip entry into the page
allocator for PREEMPT_RT
On Thu, Jul 02, 2020 at 10:19:08PM +0200, Sebastian Andrzej Siewior wrote:
> On 2020-07-02 09:48:26 [-0700], Paul E. McKenney wrote:
> > On Thu, Jul 02, 2020 at 04:12:16PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2020-06-30 11:35:34 [-0700], Paul E. McKenney wrote:
> > > > > This is not going to work together with the "wait context validator"
> > > > > (CONFIG_PROVE_RAW_LOCK_NESTING). As of -rc3 it should complain about
> > > > > printk(), which is why it is still disabled by default.
> > > >
> > > > Fixing that should be "interesting". In particular, RCU CPU stall
> > > > warnings rely on the raw spin lock to reduce false positives due
> > > > to race conditions. Some thought will be required here.
> > >
> > > I don't get this part. Can you explain, or give me an example of where
> > > to look?
> >
> > Starting from the scheduler-clock interrupt's call into RCU,
> > we have rcu_sched_clock_irq() which calls rcu_pending() which
> > calls check_cpu_stall() which calls either print_cpu_stall() or
> > print_other_cpu_stall(), depending on whether the stall is happening on
> > the current CPU or on some other CPU, respectively.
> >
> > Both of these last functions acquire the rcu_node structure's raw ->lock
> > and expect to do printk()s while holding it.
>
> …
> > Thoughts?
>
> Okay. So in the RT queue there is a printk() rewrite which fixes this
> kind of thing. Upstream the printk() interface is still broken in this
> regard, and therefore CONFIG_PROVE_RAW_LOCK_NESTING is disabled.
> [Earlier the workqueue would also trigger a warning, but this has been
> fixed as of v5.8-rc1.]
> This was just me explaining why this is bad, which debug function would
> report it, and why it is not enabled by default.
Whew!!! ;-)
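
For concreteness, the pattern that would trip the validator looks
roughly like the following, which is a simplified sketch of
print_cpu_stall() rather than the exact upstream code (the real code
goes through the _rcu_node lock wrappers and pr_err()):

	/*
	 * Simplified sketch: the stall-warning code prints while
	 * holding the rcu_node structure's raw ->lock.  Once printk()
	 * internally acquires a spinlock_t, CONFIG_PROVE_RAW_LOCK_NESTING
	 * flags this raw-lock -> sleeping-lock nesting.
	 */
	static void print_cpu_stall_sketch(struct rcu_node *rnp)
	{
		unsigned long flags;

		raw_spin_lock_irqsave(&rnp->lock, flags);
		printk(KERN_ERR "INFO: rcu detected stall on CPU\n");
		raw_spin_unlock_irqrestore(&rnp->lock, flags);
	}
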
> > > > > So assume that this is fixed and enabled; then on !PREEMPT_RT it will
> > > > > complain that you have a raw_spinlock_t acquired (the one from patch
> > > > > 02/17) and are attempting to acquire a spinlock_t in the memory
> > > > > allocator.
> > > >
> > > > Given that the slab allocator doesn't acquire any locks until it gets
> > > > a fair way in, wouldn't it make sense to allow a "shallow" allocation
> > > > while a raw spinlock is held? This would require yet another GFP_ flag,
> > > > but that won't make all that much of a difference in the total. ;-)
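
For illustration, a caller under a raw_spinlock_t might then look
something like the sketch below.  __GFP_NO_LOCKS is an invented name
for the hypothetical "shallow allocation" flag, and the helper is
likewise made up; nothing like either exists upstream today:

	/*
	 * Hypothetical "shallow" allocation under a raw_spinlock_t.
	 * The invented __GFP_NO_LOCKS flag would promise that the
	 * allocator acquires no spinlock_t, failing the request
	 * instead of entering reclaim or taking zone locks.
	 */
	raw_spin_lock_irqsave(&krcp->lock, flags);
	bnode = (struct kfree_rcu_bulk_data *)
		__get_free_page(GFP_NOWAIT | __GFP_NO_LOCKS);
	if (bnode)
		attach_bulk_page(krcp, bnode);	/* hypothetical helper */
	raw_spin_unlock_irqrestore(&krcp->lock, flags);
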
> > >
> > > That would be one way of dealing with it. But we could go back to
> > > spinlock_t and keep the memory allocation as-is even for RT. I don't see
> > > a downside of this. And we would worry about kfree_rcu() from a real
> > > IRQ-off region once we get to it.
> >
> > Once we get to it, your thought would be to do per-CPU queuing of
> > memory from IRQ-off kfree_rcu(), and have IRQ work or some such clean
> > up after it? Or did you have some other trick in mind?
>
> So for now I would very much like to revert the raw_spinlock_t back to
> the spinlock_t and add a migrate_disable() just to avoid the tiny
> window of possible migration between obtaining the CPU-ptr and acquiring
> the lock (I think Joel was afraid of a performance hit).
Performance is indeed a concern here.
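
If I understand the proposal, the kfree_rcu() fastpath would then look
something like this sketch.  The per-CPU "krc" follows the existing
kfree_rcu() code, but the details are illustrative, and
migrate_disable() is of course the RT-tree primitive:

	/*
	 * Sketch of the proposed locking: spinlock_t (a sleeping lock
	 * on PREEMPT_RT) plus migrate_disable() to close the tiny
	 * window between picking up the per-CPU pointer and taking
	 * the lock.  The raw_cpu_ptr() variant is used because
	 * stability comes from migrate_disable(), not preempt-off.
	 */
	migrate_disable();
	krcp = raw_cpu_ptr(&krc);
	spin_lock_irqsave(&krcp->lock, flags);
	/* ... queue the object to be freed after a grace period ... */
	spin_unlock_irqrestore(&krcp->lock, flags);
	migrate_enable();
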
> Should we get to a *real* use case where someone must invoke kfree_rcu()
> from a hard-IRQ-off region, then we can think about what makes sense.
> Per-CPU queues and IRQ-work would be one way of dealing with it.
It looks like workqueues can also be used, at least in their current
form. And timers.
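
For the record, the per-CPU-queue-plus-IRQ-work approach might look
something like the sketch below; all of the names are invented for
illustration, and the irq_work initialization at boot is not shown:

	#include <linux/irq_work.h>
	#include <linux/llist.h>
	#include <linux/percpu.h>

	/* Per-CPU holding pen for objects freed from hard-IRQ-off context. */
	struct kfree_irqoff_queue {
		struct llist_head head;	/* lock-free, usable with IRQs off */
		struct irq_work work;	/* init_irq_work() at boot (not shown) */
	};
	static DEFINE_PER_CPU(struct kfree_irqoff_queue, kfree_irqoff_queue);

	/* Enqueue side, callable with interrupts hard-disabled. */
	static void kfree_rcu_irqoff(struct llist_node *node)
	{
		struct kfree_irqoff_queue *q = raw_cpu_ptr(&kfree_irqoff_queue);

		llist_add(node, &q->head);
		irq_work_queue(&q->work);	/* safe from IRQ-off context */
	}

	/*
	 * IRQ-work handler: runs with interrupts enabled, so taking a
	 * spinlock_t and feeding the normal kfree_rcu() machinery is
	 * again legal here.
	 */
	static void kfree_irqoff_drain(struct irq_work *work)
	{
		struct kfree_irqoff_queue *q =
			container_of(work, struct kfree_irqoff_queue, work);
		struct llist_node *node, *next;

		for (node = llist_del_all(&q->head); node; node = next) {
			next = node->next;
			queue_for_kfree_rcu(node);	/* hypothetical handoff */
		}
	}
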
Vlad, Joel, thoughts?
Thanx, Paul