linux-kernel - Re: [PATCH v17 02/16] preempt: Track NMI nesting to separate per-CPU counter

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <aYVXQVwOZapVbnP5@tardis.local>
Date: Thu, 5 Feb 2026 18:51:45 -0800
From: Boqun Feng <boqun@...nel.org>
To: Joel Fernandes <joelagnelf@...dia.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Lyude Paul <lyude@...hat.com>,
	rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Boqun Feng <boqun.feng@...il.com>,
	Daniel Almeida <daniel.almeida@...labora.com>,
	Miguel Ojeda <ojeda@...nel.org>,
	Alex Gaynor <alex.gaynor@...il.com>, Gary Guo <gary@...yguo.net>,
	Björn Roy Baron <bjorn3_gh@...tonmail.com>,
	Benno Lossin <lossin@...nel.org>,
	Andreas Hindborg <a.hindborg@...nel.org>,
	Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
	Danilo Krummrich <dakr@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
	Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v17 02/16] preempt: Track NMI nesting to separate per-CPU
 counter

On Thu, Feb 05, 2026 at 08:24:40PM -0500, Joel Fernandes wrote:
> 
> 
> On 2/5/2026 8:14 PM, Boqun Feng wrote:
> > On Thu, Feb 05, 2026 at 07:50:03PM -0500, Joel Fernandes wrote:
> >>
> >>
> >> On 2/5/2026 5:17 PM, Joel Fernandes wrote:
> >>>
> >>>
> >>> On 2/5/2026 4:40 PM, Boqun Feng wrote:
> >>>> On Wed, Feb 04, 2026 at 12:12:34PM +0100, Peter Zijlstra wrote:
> >>>>> On Tue, Feb 03, 2026 at 01:15:21PM +0100, Peter Zijlstra wrote:
> >>>>>> But I'm really somewhat sad that 64bit can't do better than this.
> >>>>>
> >>>>> Here, the below builds and boots (albeit with warnings because printf
> >>>>> format crap sucks).
> >>>>>
> >>>>
> >>>> Thanks! I will drop patch #1 and #2 and use this one (with a commit log
> >>>> and some more tests), given it's based on the work of Joel, Lyude and
> >>>> me, would the following tags make sense to all of you?
> >>>>> Co-developed-by: Joel Fernandes <joelagnelf@...dia.com>
> >>>
> >>> I don't know, I am not a big fan of the alternative patch because it adds a
> >>> per-cpu counter anyway if !CONFIG_PREEMPT_LONG [1]. And it is also a much bigger
> >>> patch than the one I wrote. Purely from an objective perspective, I would still
> >>> want to keep my original patch because it is simple. What is really the
> >>> objection to it?
> >>>
> > 
> > PREEMPT_LONG is an architecture-specific way to improve the performance
> > IMO. Just to be clear, do you object it at all, or do you object
> > combining it with your original patch? If it's the latter, I could make
> > another patch as a follow to enable PREEMPT_LONG.
> 
> When I looked at the alternative patch, I did consider that it was
> overcomplicated and it should be justified. Otherwise, I don't object to it. It

I don't think that's overcomplicated. Note that people have different
goals, for us (you, Lyude and me), we want to have a safer
interrupt-disabling lock API, hence this patchset. I think Peter on the
other hand while agreeing with us on the necessity, but wants to avoid
potential performance lost (maybe in general also likes the idea of
preempt_count being 64bit on 64bit machines ;-)) That patch looks
"overcomplicated" because it contains both goals (it actually contains
patch #1 and #2 along with the improvement). If you look them
separately, it would be not that complicated (Peter's diff against patch
1 + 2 will be relatively small).

> seems to be a matter of preference I think. I would prefer a simpler fix than an
> overcomplicated fix for a hypothetical issue (unless we have data showing
> issue). If it was a few lines of change, that'd be different story.
> 
> > 
> >>> [1]
> >>> +#ifndef CONFIG_PREEMPT_LONG
> >>> +/*
> >>> + * Any 32bit architecture that still cares about performance should
> >>> + * probably ensure this is near preempt_count.
> >>> + */
> >>> +DEFINE_PER_CPU(unsigned int, nmi_nesting);
> >>> +#endif
> >>>
> >> If the objection to my patch is modifying a per-cpu counter, isn't NMI a slow
> >> path? If we agree, then keeping things simple is better IMO unless we have data
> > 
> > I guess Peter was trying to say it's not a slow path if you consider
> > perf event interrupts on x86? [1]
> 
> How are we handling this performance issue then on 32-bit x86 architecture with
> perf? Or are we saying we don't care about performance on 32-bit?
> 

I'm not in the position to answer this (mostly for the second question).
Either we have data proving that the performance gap caused by your
original patch is small enough (if there is any) or it's up to x86
maintainers.

Regards,
Boqun

> -- 
> Joel Fernandes
>