Message-ID: <X/w+yJmCBnDWxtoE@hirez.programming.kicks-ass.net>
Date: Mon, 11 Jan 2021 13:04:24 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: "Paul E . McKenney" <paulmck@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, stable@...r.kernel.org
Subject: Re: [RFC PATCH 4/8] rcu/nocb: Trigger self-IPI on late deferred wake up before user resume
On Sat, Jan 09, 2021 at 03:05:32AM +0100, Frederic Weisbecker wrote:
> Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP
> kthread (rcuog) to be serviced.
>
> Unfortunately the call to rcu_user_enter() is already past the last
> rescheduling opportunity before we resume to userspace or to guest mode.
> We may escape there with the woken task ignored.
>
> The ultimate resort to fix every callsite is to trigger a self-IPI
> (nohz_full depends on IRQ_WORK) that will trigger a reschedule on IRQ
> tail or guest exit.
>
> Eventually every site that wants a saner treatment will need to carefully
> place a call to rcu_nocb_flush_deferred_wakeup() before the last explicit
> need_resched() check upon resume.
>
> Reported-by: Paul E. McKenney <paulmck@...nel.org>
> Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf)
> Cc: stable@...r.kernel.org
> Cc: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Ingo Molnar <mingo@...nel.org>
> Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> ---
> kernel/rcu/tree.c | 22 +++++++++++++++++++++-
> kernel/rcu/tree.h | 2 +-
> kernel/rcu/tree_plugin.h | 25 ++++++++++++++++---------
> 3 files changed, 38 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index b6e1377774e3..2920dfc9f58c 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -676,6 +676,18 @@ void rcu_idle_enter(void)
> EXPORT_SYMBOL_GPL(rcu_idle_enter);
>
> #ifdef CONFIG_NO_HZ_FULL
> +
> +/*
> + * An empty function that will trigger a reschedule on
> + * IRQ tail once IRQs get re-enabled on userspace resume.
> + */
> +static void late_wakeup_func(struct irq_work *work)
> +{
> +}
> +
> +static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
> + IRQ_WORK_INIT(late_wakeup_func);
> +
> /**
> * rcu_user_enter - inform RCU that we are resuming userspace.
> *
> @@ -692,9 +704,17 @@ noinstr void rcu_user_enter(void)
> struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>
> lockdep_assert_irqs_disabled();
> - do_nocb_deferred_wakeup(rdp);
> + /*
> + * We may be past the last rescheduling opportunity in the entry code.
> + * Trigger a self IPI that will fire and reschedule once we resume to
> + * user/guest mode.
> + */
> + if (do_nocb_deferred_wakeup(rdp) && need_resched())
> + irq_work_queue(this_cpu_ptr(&late_wakeup_work));
> +
> rcu_eqs_enter(true);
> }

Do we have the guarantee that every architecture that supports NOHZ_FULL
implements arch_irq_work_raise()?

Also, can't you do the same thing you did earlier and do that wakeup
thing before we complete exit_to_user_mode_prepare()?
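
The ordering question can be made concrete with a small userspace model. This is only a sketch, not kernel code: cpu_model, model_flush_deferred_wakeup(), model_resume_to_user() and the enum values are hypothetical stand-ins, and the model assumes a single "last need_resched() check" followed by the RCU user-entry step. It compares flushing the deferred wakeup late (the pre-patch behaviour), flushing late plus a self-IPI (the quoted patch), and flushing before the last check (the alternative asked about).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU; none of these are kernel symbols. */
struct cpu_model {
	bool deferred_wakeup;   /* a NOCB wakeup was deferred earlier */
	bool need_resched;      /* set once the deferred wakeup is performed */
	bool self_ipi_pending;  /* stands in for a queued irq_work */
};

enum strategy {
	FLUSH_LATE,               /* flush only at RCU user entry */
	FLUSH_LATE_WITH_SELF_IPI, /* flush late, queue a self-IPI if needed */
	FLUSH_BEFORE_LAST_CHECK   /* flush before the last resched check */
};

enum outcome {
	NOTHING_PENDING,
	WOKEN_TASK_IGNORED,
	RESCHED_ON_IRQ_TAIL,
	RESCHED_IN_EXIT_LOOP
};

/* Models do_nocb_deferred_wakeup(): true if a wakeup was performed. */
static bool model_flush_deferred_wakeup(struct cpu_model *cpu)
{
	if (!cpu->deferred_wakeup)
		return false;
	cpu->deferred_wakeup = false;
	cpu->need_resched = true;
	return true;
}

static enum outcome model_resume_to_user(struct cpu_model *cpu, enum strategy s)
{
	bool resched_in_loop = false;

	assert(cpu);

	if (s == FLUSH_BEFORE_LAST_CHECK)
		model_flush_deferred_wakeup(cpu);

	/* Models the last explicit need_resched() check on the exit path. */
	if (cpu->need_resched) {
		cpu->need_resched = false; /* stands in for schedule() */
		resched_in_loop = true;
	}

	/* Models rcu_user_enter(): past the last rescheduling opportunity. */
	if (model_flush_deferred_wakeup(cpu) && cpu->need_resched &&
	    s == FLUSH_LATE_WITH_SELF_IPI)
		cpu->self_ipi_pending = true;

	/* The self-IPI fires once IRQs are re-enabled and reschedules. */
	if (cpu->self_ipi_pending) {
		cpu->self_ipi_pending = false;
		cpu->need_resched = false;
		return RESCHED_ON_IRQ_TAIL;
	}
	if (cpu->need_resched)
		return WOKEN_TASK_IGNORED;
	return resched_in_loop ? RESCHED_IN_EXIT_LOOP : NOTHING_PENDING;
}
```

In this model, a CPU with a pending deferred wakeup ends up with the woken task ignored under FLUSH_LATE, rescheduled on IRQ tail under FLUSH_LATE_WITH_SELF_IPI, and rescheduled inside the exit loop, with no IPI, under FLUSH_BEFORE_LAST_CHECK.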