linux-kernel - Re: [PATCH v13 05/17] irq & spin_lock: Add counted interrupt disabling/enabling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aPHlySQJQpDmgHAm@tardis.local>
Date: Thu, 16 Oct 2025 23:44:25 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Bart Van Assche <bvanassche@....org>, Lyude Paul <lyude@...hat.com>,
	rust-for-linux@...r.kernel.org,	Thomas Gleixner <tglx@...utronix.de>,
 linux-kernel@...r.kernel.org,
	Daniel Almeida <daniel.almeida@...labora.com>,
	Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,	Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>,	Valentin Schneider <vschneid@...hat.com>,
	Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>,
	Miguel Ojeda <ojeda@...nel.org>,	Alex Gaynor <alex.gaynor@...il.com>,
 Gary Guo <gary@...yguo.net>,
	Björn Roy Baron <bjorn3_gh@...tonmail.com>,
	Benno Lossin <lossin@...nel.org>,
	Andreas Hindborg <a.hindborg@...nel.org>,
	Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
	Danilo Krummrich <dakr@...nel.org>,	David Woodhouse <dwmw@...zon.co.uk>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Joel Fernandes <joelagnelf@...dia.com>,
	Ryo Takakura <ryotkkr98@...il.com>,
	K Prateek Nayak <kprateek.nayak@....com>
Subject: Re: [PATCH v13 05/17] irq & spin_lock: Add counted interrupt
 disabling/enabling

On Thu, Oct 16, 2025 at 10:15:13AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 15, 2025 at 01:54:05PM -0700, Bart Van Assche wrote:
> 
> > > This also makes the wrapper of interrupt-disabling locks on Rust easier
> > > to design.
> > 
> > Is a new counter really required to fix the issues that exist in the
> > above examples? Has it been considered to remove the spin_lock_irqsave()
> > and spin_lock_irqrestore() definitions, to introduce a macro that saves
> > the interrupt state in a local variable and restores the interrupt state
> > from the same local variable? With this new macro, the first two examples
> > above would be changed into the following (this code has not been tested
> > in any way):
> 
> So the thing is that actually frobbing the hardware interrupt state is
> relatively expensive. On x86 things like PUSHF/POPF/CLI/STI are
> definitely on the 'nice to avoid' side of things.
> 
> Various people have written patches to avoid them on x86, and while none
> of them have made it, they did show benefit (notably PowerPC already
> does something tricky because for them it is *really* expensive).
> 
> So in that regard, keeping this counter allows us to strictly reduce the
> places where we have to touch IF. The interface is nicer too, so a win
> all-round.
> 
> My main objection to all this has been that they add to the interface
> instead of replace the interface. Ideally this would implement
> spin_lock_irq() and spin_lock_irqsave() in terms of the new
> spin_lock_irq_disable() and then we go do tree wide cleanups over a few
> cycles.
> 

Right, that would be the ideal case, however I did an experiment on
ARM64 trying to implement spin_lock_irq() with the new API (actually
it's implementing local_irq_disable() with the new API), here are my
findings:

1. At least in my test env, in start_kernel() we call
   local_irq_disable() while irq is already disabled, that means we
   expect unpaired local_irq_disable() + local_irq_enable().

2. My half-baked debugging tool found out we have code like:

    __pm_runtime_resume():
      spin_lock_irqsave();
      rpm_resume():
        rpm_callback():
          __rpm_callback():
            spin_unlock_irq();
            spin_lock_irq();
      spin_lock_irqrestore();

  this works if __pm_runtime_resume() never gets called while irq is
  already disabled, but if we were to implement spin_lock_irq() with the
  new API, it would be broken.

All in all, local_irq_disable(), local_irq_save() and the new API have
semantic-wise differences, while they behave almost the same if the
interrupt disabling scopes are properly nested, but we do have
"creative" usages: 1) shows we have code actually depends on unpaired
_disable() + _enable() and 2) shows we have "buggy" code that relys on
the semantic difference to work.

In an ideal world, we should find out all 1) and 2) and adjust that to
avoid a new interface, but I feel like, especially because of the
existence of 2), that is punishing the good code because of bad code ;-)
So adding the new API first, making it easy to use and difficult to
misuse and consolidating all APIs later seems more reasonable to me.

Regards,
Boqun

> The problem is that that requires non-trivial per architecture work and
> they've so far tried to avoid this...