linux-kernel - Re: [PATCH 1/5] riscv: misaligned: factorize trap handling

Message-ID: <20250422094419.GC14170@noisy.programming.kicks-ass.net>
Date: Tue, 22 Apr 2025 11:44:19 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Clément Léger <cleger@...osinc.com>
Cc: Alexandre Ghiti <alex@...ti.fr>,
	"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
	open list <linux-kernel@...r.kernel.org>,
	"open list:RISC-V ARCHITECTURE" <linux-riscv@...ts.infradead.org>,
	"open list:KERNEL SELFTEST FRAMEWORK" <linux-kselftest@...r.kernel.org>,
	Jonathan Corbet <corbet@....net>,
	Paul Walmsley <paul.walmsley@...ive.com>,
	Palmer Dabbelt <palmer@...belt.com>,
	Albert Ou <aou@...s.berkeley.edu>, Shuah Khan <shuah@...nel.org>,
	Andrew Jones <ajones@...tanamicro.com>,
	Samuel Holland <samuel.holland@...ive.com>
Subject: Re: [PATCH 1/5] riscv: misaligned: factorize trap handling

On Tue, Apr 22, 2025 at 09:57:12AM +0200, Clément Léger wrote:
> 
> 
> On 21/04/2025 09:06, Alexandre Ghiti wrote:
> > Hi Clément,
> > 
> > 
> > On 14/04/2025 14:34, Clément Léger wrote:
> >> misaligned accesses traps are not nmi and should be treated as normal
> >> one using irqentry_enter()/exit().
> > 
> > 
> > All the traps that come from kernel mode are treated as nmi as it was
> > suggested by Peter here: https://lore.kernel.org/linux-riscv/
> > Yyhv4UUXuSfvMOw+@...ez.programming.kicks-ass.net/
> > 
> > I don't know the differences between irq_nmi_entry/exit() and irq_entry/
> > exit(), so is that still correct to now treat the kernel traps as non-nmi?
> 
> Hi Alex,
> 
> Actually, this discussion was raised on a previous series [1] by Maciej
> which replied that we should actually reenable interrupt depending on
> the state that was interrupted. Looking at other architecture/code, it
> seems like treating misaligned accesses as NMI is probably not the right
> way. For instance, loongarch treats them as normal IRQ using a
> irqentry_enter()/exit() and reenabling IRQS if possible.

So, a trap that happens in kernel space while IRQs are disabled, SHOULD
really be NMI-like.

You then have a choice, make all such traps from kernel space NMI-like;
this makes it easy on the trap handler, since the context is always the
same. Mistakes are 'easy' to find.

Or... do funny stuff and only make it NMI-like if IRQs were disabled.
Which gives inconsistent context for the handler and you'll find
yourself scratching your head at some point in the future wondering why
this one rare occasion goes BOOM.
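
To make the two options concrete, here is a minimal userspace sketch. It is
not kernel code: entry_kind, policy_always_nmi(), policy_conditional() and
contexts_seen() are hypothetical names invented for illustration. It only
models how many distinct contexts a kernel-mode trap handler has to be
correct in under each choice.

```c
#include <stdbool.h>

/* Hypothetical model of the entry flavour a kernel-mode trap is given. */
enum entry_kind { ENTRY_IRQ_LIKE, ENTRY_NMI_LIKE };

/* Option 1: every trap taken from kernel space is NMI-like. */
static enum entry_kind policy_always_nmi(bool irqs_disabled)
{
	(void)irqs_disabled;	/* the interrupted IRQ state is irrelevant */
	return ENTRY_NMI_LIKE;
}

/* Option 2: NMI-like only if the interrupted code had IRQs disabled. */
static enum entry_kind policy_conditional(bool irqs_disabled)
{
	return irqs_disabled ? ENTRY_NMI_LIKE : ENTRY_IRQ_LIKE;
}

/* Count the distinct contexts the handler can find itself running in. */
static int contexts_seen(enum entry_kind (*policy)(bool))
{
	return policy(false) == policy(true) ? 1 : 2;
}
```

With option 1 the handler only ever sees one context; with option 2 it sees
two, and the NMI-like one is only exercised when the trap happens to hit an
IRQs-disabled region.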

x86 mostly does the first: any trap that can happen with IRQs disabled
is treated unconditionally as NMI-like. The obvious exception is
page-fault, but that already has a from-non-preemptible-context branch
that is 'careful'.

As to unaligned traps from kernel space, I would imagine they mostly BUG
the kernel, except when there's an exception entry for that location, in
which case it might do a fixup?

Anyway, the reason these exceptions should be NMI-like is that
interrupts are not allowed to nest. Notably something like:

  raw_spin_lock_irqsave(&foo);
  <IRQ>
    raw_spin_lock_irqsave(&foo);
    ...

Is an obvious problem. Exceptions that can run while IRQs are disabled,
must not use locks -- treating them as NMI-like (they are non-maskable
after all), ensures this.
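
The nesting problem can be modelled in userspace as a rough sketch. This is
not the kernel's raw_spinlock implementation: toy_lock, toy_trylock(),
toy_unlock() and the two trap_* helpers are hypothetical names, and a trylock
stands in for the spin so that the model reports the deadlock instead of
hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive lock; a stand-in for a raw spinlock, not the real one. */
struct toy_lock { atomic_flag held; };

static bool toy_trylock(struct toy_lock *l)
{
	/* test_and_set returns the previous value: false means we got it */
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * A handler running on top of the interrupted code on the same CPU.
 * If the lock is already held, a real spinning lock would never return,
 * because the holder cannot run again until the handler finishes.
 */
static bool handler_would_deadlock(struct toy_lock *l)
{
	if (toy_trylock(l)) {
		toy_unlock(l);
		return false;
	}
	return true;
}

/* Trap arrives while the interrupted code holds the lock. */
static bool trap_inside_critical_section(void)
{
	struct toy_lock foo = { ATOMIC_FLAG_INIT };
	bool got = toy_trylock(&foo);	/* models raw_spin_lock_irqsave(&foo) */
	bool stuck;

	assert(got);
	stuck = handler_would_deadlock(&foo);	/* models the nested attempt */
	toy_unlock(&foo);
	return stuck;
}

/* Trap arrives while nobody holds the lock. */
static bool trap_outside_critical_section(void)
{
	struct toy_lock foo = { ATOMIC_FLAG_INIT };

	return handler_would_deadlock(&foo);
}
```

In this model trap_inside_critical_section() reports the deadlock and
trap_outside_critical_section() does not, which is the point: a handler that
can run with IRQs disabled may only take the lock safely by luck, so it must
not take it at all.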