Message-ID: <20210924225954.GN880162@paulmck-ThinkPad-P17-Gen-1>
Date: Fri, 24 Sep 2021 15:59:54 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Mark Rutland <mark.rutland@....com>
Cc: Pingfan Liu <kernelfans@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-arm-kernel@...ts.infradead.org,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Marc Zyngier <maz@...nel.org>,
Joey Gouly <joey.gouly@....com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Julien Thierry <julien.thierry@....com>,
Yuichi Ito <ito-yuichi@...itsu.com>,
linux-kernel@...r.kernel.org, Sven Schnelle <svens@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>
Subject: Re: [PATCHv2 0/5] arm64/irqentry: remove duplicate housekeeping of
On Fri, Sep 24, 2021 at 06:36:15PM +0100, Mark Rutland wrote:
> [Adding Paul for RCU, s390 folk for entry code RCU semantics]
>
> On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> > After introducing arm64/kernel/entry_common.c, which is akin to
> > kernel/entry/common.c, the rcu/trace housekeeping is done twice, as
> > follows:
> > enter_from_kernel_mode()->rcu_irq_enter().
> > And
> > gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
> >
> > Besides being redundant, code analysis shows that the duplication also
> > causes errors, e.g. rcu_data->dynticks_nmi_nesting is incremented by 2,
> > which makes rcu_is_cpu_rrupt_from_idle() return an unexpected result.
>
> Hmmm...
>
> The fundamental questions are:
>
> 1) Who is supposed to be responsible for doing the rcu entry/exit?
>
> 2) Is it supposed to matter if this happens multiple times?
>
> For (1), I'd generally expect that this is supposed to happen in the
> arch/common entry code, since that itself (or the irqchip driver) could
> depend on RCU, and if that's the case then handle_domain_irq()
> shouldn't need to call rcu_irq_enter(). That would be consistent with
> the way we handle all other exceptions.
>
> For (2) I don't know whether the level of nesting is supposed to
> matter. I was under the impression it wasn't meant to matter in general,
> so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
> specific level of nesting.
>
> From a glance it looks like this would cause rcu_sched_clock_irq() to
> skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
> doesn't sound right, at least...
>
> Thomas, Paul, thoughts?

It is absolutely required that rcu_irq_enter() and rcu_irq_exit() calls
be balanced. Normally, this is taken care of by the fact that irq_enter()
invokes rcu_irq_enter() and irq_exit() invokes rcu_irq_exit(). Similarly,
nmi_enter() invokes rcu_nmi_enter() and nmi_exit() invokes rcu_nmi_exit().

But if you are doing some special-case exception where the handler needs
to use RCU readers, but where the rest of the work is not needed, then
the resulting calls to rcu_irq_enter() and rcu_irq_exit() must be in
the architecture-specific code and must be properly balanced.

So if exception entry invokes rcu_irq_enter() twice, then exception
exit also needs to invoke rcu_irq_exit() twice.
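
As a rough illustration of the balancing requirement, here is a user-space
toy model in C. It is only a sketch under stated assumptions: struct toy_rcu
and every toy_*() function are hypothetical names invented for this example,
and the single counter is a simplification, not the kernel's actual
dynticks_nmi_nesting accounting. It shows that a doubled entry stays balanced
only when the exit is doubled too, and that a check expecting exactly one
level of nesting (in the spirit of the concern raised above about
rcu_is_cpu_rrupt_from_idle()) no longer fires when entry is done twice.

```c
#include <stdbool.h>

/* Toy stand-in for per-CPU RCU state; NOT the kernel's struct rcu_data. */
struct toy_rcu {
	long irq_nesting;	/* outstanding "enter" calls not yet exited */
};

/* Hypothetical analogues of rcu_irq_enter()/rcu_irq_exit(). */
void toy_rcu_irq_enter(struct toy_rcu *r) { r->irq_nesting++; }
void toy_rcu_irq_exit(struct toy_rcu *r)  { r->irq_nesting--; }

/* Assumed rule for this model: "from idle" means exactly one level. */
bool toy_rrupt_from_idle(const struct toy_rcu *r)
{
	return r->irq_nesting == 1;
}

/* True when every enter has been matched by an exit. */
bool toy_balanced(const struct toy_rcu *r)
{
	return r->irq_nesting == 0;
}

/*
 * Simulate one interrupt taken from idle: perform `enters` entry calls,
 * sample the from-idle check inside the handler, then perform `exits`
 * exit calls. Returns the nesting level left behind afterwards.
 */
long toy_handle_irq(struct toy_rcu *r, int enters, int exits,
		    bool *seen_from_idle)
{
	int i;

	for (i = 0; i < enters; i++)
		toy_rcu_irq_enter(r);

	*seen_from_idle = toy_rrupt_from_idle(r);

	for (i = 0; i < exits; i++)
		toy_rcu_irq_exit(r);

	return r->irq_nesting;
}
```

Calling toy_handle_irq() on a zeroed struct toy_rcu with (1, 1) leaves the
counter balanced and the from-idle check true; with (2, 2) it is still
balanced but the check is false inside the handler; with (2, 1) a nesting
level is leaked and carried into the next interrupt.
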

There are some constraints on where calls to these functions are placed.
For example, any exception-entry code prior to the call to rcu_irq_enter()
must consist solely of functions marked noinstr, but Thomas can tell
you more.

Or am I missing the point of your question?

							Thanx, Paul

> AFAICT, s390 will have a similar flow on its IRQ handling path, so if
> this is a real issue they'll be affected too.
>
> Thanks,
> Mark.
>
> > Nmi also faces duplicate accounts. This series aims to address these
> > duplicate issues.
> > [1-2/5]: address nmi account duplicate
> > [3-4/5]: address rcu housekeeping duplicate in irq
> > [5/5]: as a natural result of [3-4/5], address a history issue. [1]
> >
> >
> > History:
> > v1 -> v2:
> > change the subject as the motivation varies.
> > add the fix for nmi account duplicate
> >
> > The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
> > makes irq_enter/exit arch optional". [2] It is brought up to fix [1].
> >
> > There have been some tries to enable crash-stop-NMI on arm64, one by me,
> > the other by Yuichi's [4]. I hope after this series, they can advance,
> > as Marc said in [3] "No additional NMI patches will make it until we
> > have resolved the issues"
> >
> > [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
> > [2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@gmail.com
> > [3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@kernel.org
> > [4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@fujitsu.com
> >
> > Cc: Catalin Marinas <catalin.marinas@....com>
> > Cc: Will Deacon <will@...nel.org>
> > Cc: Mark Rutland <mark.rutland@....com>
> > Cc: Marc Zyngier <maz@...nel.org>
> > Cc: Joey Gouly <joey.gouly@....com>
> > Cc: Sami Tolvanen <samitolvanen@...gle.com>
> > Cc: Julien Thierry <julien.thierry@....com>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Cc: Yuichi Ito <ito-yuichi@...itsu.com>
> > Cc: linux-kernel@...r.kernel.org
> > To: linux-arm-kernel@...ts.infradead.org
> >
> >
> > Pingfan Liu (5):
> > arm64/entry-common: push the judgement of nmi ahead
> > irqchip/GICv3: expose handle_nmi() directly
> > kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
> > optional
> > irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
> > irqchip/GICv3: make reschedule-ipi light weight
> >
> > arch/arm64/Kconfig | 1 +
> > arch/arm64/include/asm/irq.h | 7 ++++
> > arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
> > arch/arm64/kernel/irq.c | 29 ++++++++++++++
> > drivers/irqchip/irq-gic-v3.c | 66 ++++++++++++++++++++------------
> > kernel/irq/Kconfig | 3 ++
> > kernel/irq/irqdesc.c | 4 ++
> > 7 files changed, 116 insertions(+), 39 deletions(-)
> >
> > --
> > 2.31.1
> >