Message-ID: <20201112001129.GD3249@paulmck-ThinkPad-P72>
Date: Wed, 11 Nov 2020 16:11:29 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Anders Roxell <anders.roxell@...aro.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Alexander Potapenko <glider@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Jann Horn <jannh@...gle.com>,
Mark Rutland <mark.rutland@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
kasan-dev <kasan-dev@...glegroups.com>, rcu@...r.kernel.org,
peterz@...radead.org
Subject: Re: [PATCH] kfence: Avoid stalling work queue task without
allocations
On Wed, Nov 11, 2020 at 09:21:53PM +0100, Marco Elver wrote:
> On Wed, Nov 11, 2020 at 11:21AM -0800, Paul E. McKenney wrote:
> [...]
> > > > rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled
> > >
> > > Sadly, no, next-20201110 already included that one, and that's what I
> > > tested and got me all those warnings above.
> >
> > Hey, I had to ask!  The only uncertainty I see is the acquisition of
> > the lock in rcu_iw_handler(), for which I add a lockdep check in the
> > (untested) patch below. The other thing I could do is sprinkle such
> > checks through the stall-warning code on the assumption that something
> > RCU is calling is enabling interrupts.
> >
> > Other thoughts?
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index 70d48c5..3d67650 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -189,6 +189,7 @@ static void rcu_iw_handler(struct irq_work *iwp)
> >
> > rdp = container_of(iwp, struct rcu_data, rcu_iw);
> > rnp = rdp->mynode;
> > + lockdep_assert_irqs_disabled();
> > raw_spin_lock_rcu_node(rnp);
> > if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
> > rdp->rcu_iw_gp_seq = rnp->gp_seq;
>
> This assert didn't fire yet, I just get more of the below. I'll keep
> rerunning, but am not too hopeful...
Is bisection a possibility?
Failing that, please see the updated patch below. This adds a few more
calls to lockdep_assert_irqs_disabled(), but perhaps more helpfully dumps
the current stack of the CPU that the RCU grace-period kthread wants to
run on in the case where this kthread has been starved of CPU.
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 70d48c5..d203ea0 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -189,6 +189,7 @@ static void rcu_iw_handler(struct irq_work *iwp)
rdp = container_of(iwp, struct rcu_data, rcu_iw);
rnp = rdp->mynode;
+ lockdep_assert_irqs_disabled();
raw_spin_lock_rcu_node(rnp);
if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
rdp->rcu_iw_gp_seq = rnp->gp_seq;
@@ -449,21 +450,32 @@ static void print_cpu_stall_info(int cpu)
/* Complain about starvation of grace-period kthread. */
static void rcu_check_gp_kthread_starvation(void)
{
+ int cpu;
struct task_struct *gpk = rcu_state.gp_kthread;
unsigned long j;
if (rcu_is_gp_kthread_starving(&j)) {
+ cpu = gpk ? task_cpu(gpk) : -1;
pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
rcu_state.name, j,
(long)rcu_seq_current(&rcu_state.gp_seq),
data_race(rcu_state.gp_flags),
gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
- gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
+ gpk ? gpk->state : ~0, cpu);
if (gpk) {
pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
pr_err("RCU grace-period kthread stack dump:\n");
+ lockdep_assert_irqs_disabled();
sched_show_task(gpk);
+ lockdep_assert_irqs_disabled();
+ if (cpu >= 0) {
+ pr_err("Stack dump where RCU grace-period kthread last ran:\n");
+ if (!trigger_single_cpu_backtrace(cpu))
+ dump_cpu_task(cpu);
+ }
+ lockdep_assert_irqs_disabled();
wake_up_process(gpk);
+ lockdep_assert_irqs_disabled();
}
}
}