Message-ID: <dswbcfrlrrhikpjjrr2aluipb7qn4stfmrept27nr2b4egeg3x@pgt73zwmfm7x>
Date: Tue, 15 Jul 2025 09:09:32 -0700
From: Breno Leitao <leitao@...ian.org>
To: Will Deacon <will@...nel.org>
Cc: Catalin Marinas <catalin.marinas@....com>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
osandov@...com, leo.yan@....com, rmikey@...a.com
Subject: Re: [PATCH] arm64: traps: Mark kernel as tainted on SError panic
On Tue, Jul 15, 2025 at 03:02:13PM +0100, Will Deacon wrote:
> On Mon, Jul 14, 2025 at 05:26:43AM -0700, Breno Leitao wrote:
> > On Sun, Jul 13, 2025 at 11:46:06PM +0100, Will Deacon wrote:
> > > On Thu, Jul 10, 2025 at 03:46:35AM -0700, Breno Leitao wrote:
> >
> > > > --- a/arch/arm64/kernel/traps.c
> > > > +++ b/arch/arm64/kernel/traps.c
> > > > @@ -931,6 +931,7 @@ void __noreturn panic_bad_stack(struct pt_regs *regs, unsigned long esr, unsigne
> > > >
> > > >  void __noreturn arm64_serror_panic(struct pt_regs *regs, unsigned long esr)
> > > >  {
> > > > +	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_STILL_OK);
> > > >  	console_verbose();
> > > >
> > > >  	pr_crit("SError Interrupt on CPU%d, code 0x%016lx -- %s\n",
> > >
> > > If we're going to taint for SError, shouldn't we also taint for an
> > > unclaimed SEA?
> >
> > Yes. I was not very familiar with SEA errors, given I haven't seen one in
> > production yet, but, reading about it, that is another error that seems to
> > crash the system due to hardware errors, thus, we want to taint MACHINE_CHECK.
> >
> > What about this?
> >
> > Author: Breno Leitao <leitao@...ian.org>
> > Date: Mon Jul 14 05:16:55 2025 -0700
> >
> > arm64: Taint kernel on fatal hardware error in do_sea()
> >
> > This patch updates the do_sea() handler to taint the kernel with
> > TAINT_MACHINE_CHECK when a fatal hardware error is detected and
> > reported through Synchronous External Abort (SEA). By marking
> > the kernel as tainted at the point of error, we improve
> > post-mortem diagnostics and make it clear that a machine check
> > or unrecoverable hardware fault has occurred.
> >
> > Suggested-by: Will Deacon <will@...nel.org>
> > Signed-off-by: Breno Leitao <leitao@...ian.org>
> >
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index 11eb8d1adc84..f590dc71ce99 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -838,6 +838,7 @@ static int do_sea(unsigned long far, unsigned long esr, struct pt_regs *regs)
> >  		 */
> >  		siaddr = untagged_addr(far);
> >  	}
> > +	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_STILL_OK);
> >  	arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
> >
> >  	return 0;
>
> Yeah, I reckon so. Probably just fold these into a single patch, though.
Thanks. I'll test it more thoroughly tomorrow, then send it.
Thanks for the suggestions,
--breno
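
For post-mortem tooling that wants to act on the taint discussed above, here is a minimal userspace sketch of checking for the machine-check flag in a taint mask. It is not part of the patch. The helper names taint_has_machine_check() and read_taint_mask() are hypothetical; the sketch assumes only that TAINT_MACHINE_CHECK is bit 4 of the mask (the 'M' flag, value 16).

```c
#include <stdbool.h>
#include <stdio.h>

/* Bit position of TAINT_MACHINE_CHECK in the kernel taint mask ('M' flag). */
#define TAINT_MACHINE_CHECK_BIT 4

/* Hypothetical helper: true if the mask has the machine-check bit set. */
static bool taint_has_machine_check(unsigned long mask)
{
	return (mask >> TAINT_MACHINE_CHECK_BIT) & 1UL;
}

/*
 * Hypothetical helper: read a decimal taint mask from a file.
 * Returns 0 on success and stores the value in *mask, -1 on failure.
 */
static int read_taint_mask(const char *path, unsigned long *mask)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", mask) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

On a live system the mask can be read by passing "/proc/sys/kernel/tainted" to read_taint_mask(). After a fatal SError or unclaimed SEA the machine has usually panicked, so in practice the same bit test would more likely be applied to a mask recovered from a crash dump or from the taint string in the panic log.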