linux-kernel - Re: [PATCHv2 1/5] arm64/entry-common: push the judgement of nmi ahead


Message-ID: <YWEXPIIeMgSAuSBf@piliu.users.ipa.redhat.com>
Date:   Sat, 9 Oct 2021 12:14:52 +0800
From:   Pingfan Liu <kernelfans@...il.com>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Mark Rutland <mark.rutland@....com>,
        linux-arm-kernel@...ts.infradead.org,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>, Marc Zyngier <maz@...nel.org>,
        Joey Gouly <joey.gouly@....com>,
        Sami Tolvanen <samitolvanen@...gle.com>,
        Julien Thierry <julien.thierry@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Yuichi Ito <ito-yuichi@...itsu.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2 1/5] arm64/entry-common: push the judgement of nmi ahead

On Fri, Oct 08, 2021 at 08:45:23AM -0700, Paul E. McKenney wrote:
> On Fri, Oct 08, 2021 at 12:01:25PM +0800, Pingfan Liu wrote:
> > Sorry that I missed this message and I am just back from a long
> > festival.
> > 
> > Adding Paul for RCU guidance.
> 
> Didn't the recent patch series cover this, or is this a new problem?
> 
Sorry for not explaining it clearly. This is a new problem.

The recently acked series derives from [3-4/5], which addresses nested
calls within a single normal interrupt handler:
    rcu_irq_enter()
        rcu_irq_enter()
        ...
        rcu_irq_exit()
    rcu_irq_exit()


This new problem [1-2/5], in contrast, is about pNMI (similar to NMI in this context).
On arm64, the current flow in a pNMI handler looks like:
    rcu_irq_enter() { rcu_nmi_enter() }
        ^^^ At this point, the handler is temporarily treated as a normal
            interrupt (there has been no chance yet to call
            __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET)), so
            rcu_nmi_enter() cannot tell that it is running in an NMI:
            its "if (!in_nmi())" checks all evaluate to true.
            (See "questionA" below.)
        nmi_enter()
        NMI handler
        nmi_exit()
    rcu_irq_exit()

[...]
> > Refer to rcu_nmi_enter(), which can be called by
> > enter_from_kernel_mode():
> > 
> > ||noinstr void rcu_nmi_enter(void)
> > ||{
> > ||        ...
> > ||        if (rcu_dynticks_curr_cpu_in_eqs()) {
> > ||
> > ||                if (!in_nmi())
> > ||                        rcu_dynticks_task_exit();
> > ||
> > ||                // RCU is not watching here ...
> > ||                rcu_dynticks_eqs_exit();
> > ||                // ... but is watching here.
> > ||
> > ||                if (!in_nmi()) {
> > ||                        instrumentation_begin();
> > ||                        rcu_cleanup_after_idle();
> > ||                        instrumentation_end();
> > ||                }
> > ||
> > ||                instrumentation_begin();
> > ||                // instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
> > ||                instrument_atomic_read(&rdp->dynticks, sizeof(rdp->dynticks));
> > ||                // instrumentation for the noinstr rcu_dynticks_eqs_exit()
> > ||                instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
> > ||
> > ||                incby = 1;
> > ||        } else if (!in_nmi()) {
> > ||                instrumentation_begin();
> > ||                rcu_irq_enter_check_tick();
> > ||        } else  {
> > ||                instrumentation_begin();
> > ||        }
> > ||        ...
> > ||}
> > 
> > There are three pieces of code guarded by if (!in_nmi()). At least the
> > last one, "rcu_irq_enter_check_tick()", can trigger a hard lockup bug,
> > because it is supposed to hold a spin lock with IRQs off via
> > "raw_spin_lock_rcu_node(rdp->mynode)", but a pNMI can breach that. The
> > same scenario applies to rcu_nmi_exit()->rcu_prepare_for_idle().
> > 

questionA:
> > As for the first two "if (!in_nmi())" checks, I have no idea why they
> > are needed, other than to keep an NMI from breaching spin_lock_irq().
> > I hope Paul can give some guidance.
> > 

Thanks,

	Pingfan