linux-kernel - Re: #PF from NMI

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87ima8luix.fsf@nanos.tec.linutronix.de>
Date:   Sat, 14 Nov 2020 00:13:58 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Peter Zijlstra <peterz@...radead.org>,
        Paul McKenney <paulmck@...nel.org>, eupm90@...il.com
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Andy Lutomirski <luto@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: #PF from NMI

On Fri, Nov 13 2020 at 13:53, Peter Zijlstra wrote:
> [  139.226724] WARNING: CPU: 9 PID: 2290 at kernel/rcu/tree.c:932 __rcu_irq_enter_check_tick+0x84/0xd0
> [  139.226753]  irqentry_enter+0x25/0x40
> [  139.226753]  exc_page_fault+0x38/0x4c0
> [  139.226753]  asm_exc_page_fault+0x1e/0x30

...

> [  139.226757]  perf_callchain_user+0xf4/0x280
>
> Which is a #PF from NMI context, which is perfectly fine. However
> __rcu_irq_enter_check_tick() is triggering WARN.
>
> AFAICT the right thing is to simply remove the warn like so.
>
> ---
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 430ba58d8bfe..9bda92d8b914 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -928,8 +928,8 @@ void __rcu_irq_enter_check_tick(void)
>  {
>  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>  
> -	 // Enabling the tick is unsafe in NMI handlers.
> -	if (WARN_ON_ONCE(in_nmi()))
> +	// if we're here from NMI, there's nothing to do.
> +	if (in_nmi())
>  		return;
>  
>  	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),

Yes. That's right.

To answer Pauls question:

> But is a corresponding change required on return-from-NMI side?
> Looks OK to me at first glance, but I could be missing something.

No. The corresponding issue is not return from NMI. The corresponding
problem is the return from the page fault handler, but there is nothing
to worry about. That part is NMI safe already.

And Luto's as well:

> with the following caveat that has nothing to do with NMI: the rest of
> irqentry_enter() has tracing calls in it. Does anything prevent
> infinite recursion if one of those tracing calls causes a page fault?

nmi:
  ...
  trace_hardirqs_off_finish() {
    if (!this_cpu_read(tracing_irq_cpu)) {
           this_cpu_write(tracing_irq_cpu, 1);
           ...
  }
  ...
  perf()

#PF
  save_cr2()
  
  irqentry_enter()
     trace_hardirqs_off_finish()
        if (!this_cpu_read(tracing_irq_cpu)) {

So yes, it is recursion protected unless I'm missing something.

Thanks,

        tglx