Message-ID: <87ikryq9ik.ffs@tglx>
Date: Thu, 05 Dec 2024 19:17:55 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Waiman Long <llong@...hat.com>, Waiman Long <llong@...hat.com>, Ingo
 Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
 <dave.hansen@...ux.intel.com>, Peter Zijlstra <peterz@...radead.org>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, "H. Peter Anvin"
 <hpa@...or.com>
Subject: Re: [PATCH v2] x86/nmi: Add an emergency handler in nmi_desc & use
 it in nmi_shootdown_cpus()

On Thu, Dec 05 2024 at 08:22, Waiman Long wrote:
> On 12/5/24 8:12 AM, Thomas Gleixner wrote:
>>> Actually, crash_nmi_callback() can return in the case of the crashing
>>> CPUs, though all the other CPUs will not return once called. So I
>>> believe the current form is correct. I will update the comment to
>>> reflect that.
>> Why would you continue servicing the NMI on a CPU which just crashed?
>
> According to crash_nmi_callback(),
>
>          /*
>           * Don't do anything if this handler is invoked on crashing cpu.
>           * Otherwise, system will completely hang. Crashing cpu can get
>           * an NMI if system was initially booted with nmi_watchdog parameter.
>           */
>          if (cpu == crashing_cpu)
>                  return NMI_HANDLED;
>
> The crashing CPU still has work to do after shutting down other CPUs. It 
> can't wait there forever without completing other crashing actions. The 
> only thing I can see we can do is to return immediately without 
> servicing other less important nmi handlers in the list.

I understand that, but if the crashed CPU receives an NMI and sees that
the emergency handler is set, shouldn't it stop the NMI processing right
there instead of going through perf and whatnot while the system is
already in a fragile state? I.e.:

       if (emergency_handler) {
          emergency_handler();
          return;
       }
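
To make the intended control flow concrete, here is a minimal user-space
sketch of that early return. It is only an illustration under stated
assumptions: nmi_desc_sketch, nmi_dispatch_sketch(), perf_like_handler()
and crash_like_handler() are hypothetical stand-ins, not the kernel's
actual structures or functions, and the real patch may look different.

```c
#include <stddef.h>

/* Hypothetical handler type; returns nonzero if the NMI was handled. */
typedef int (*nmi_handler_fn)(void);

/* Hypothetical stand-in for a per-type NMI descriptor. */
struct nmi_desc_sketch {
	nmi_handler_fn emergency_handler; /* NULL unless a shootdown is in progress */
	nmi_handler_fn handlers[4];       /* regular handlers (perf, watchdog, ...) */
	size_t nr_handlers;
};

/* Call counters so the effect of the short-circuit can be observed. */
static int regular_calls;
static int emergency_calls;

static int perf_like_handler(void)
{
	regular_calls++;
	return 1;
}

static int crash_like_handler(void)
{
	emergency_calls++;
	return 1;
}

/*
 * If an emergency handler is set, run only that one and stop. The
 * regular handler list is walked solely when no emergency handler
 * is installed.
 */
static int nmi_dispatch_sketch(struct nmi_desc_sketch *desc)
{
	int handled = 0;
	size_t i;

	if (desc->emergency_handler) {
		desc->emergency_handler();
		return 1;
	}

	for (i = 0; i < desc->nr_handlers; i++)
		handled += desc->handlers[i]();

	return handled;
}
```

With emergency_handler left NULL, the dispatcher walks the regular list
as usual. Once it is set, the regular handlers are never reached, even
if the emergency handler itself returns, as it does on the crashing CPU.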

Thanks,

        tglx
