Date:   Mon, 29 Oct 2018 16:43:29 +0000
From:   Russell King - ARM Linux <linux@...linux.org.uk>
To:     Mark Rutland <mark.rutland@....com>
Cc:     "Wiebe, Wladislav (Nokia - DE/Ulm)" <wladislav.wiebe@...ia.com>,
        "tony@...mide.com" <tony@...mide.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "ebiederm@...ssion.com" <ebiederm@...ssion.com>,
        "jrdr.linux@...il.com" <jrdr.linux@...il.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] arm: mm: fault: check ADFSR in case of abort

On Mon, Oct 29, 2018 at 03:54:36PM +0000, Mark Rutland wrote:
> On Mon, Oct 29, 2018 at 02:20:51PM +0000, Wiebe, Wladislav (Nokia - DE/Ulm) wrote:
> > When running into situations like:
> > "Unhandled fault: synchronous external abort (0x210) at 0xXXX"
> > or
> > "Unhandled prefetch abort: synchronous external abort (0x210) at 0xXXX"
> > it is useful to know the content of ADFSR (Auxiliary Data Fault Status
> > Register) to indicate an ECC double-bit error in L1 or L2 cache.
> > 
> > Refer to:
> > Cortex-A15 Technical Reference Manual, Revision: r2p1
> > [6.4.8. Error Correction Code]
> > 
> > Signed-off-by: Wladislav Wiebe <wladislav.wiebe@...ia.com>
> > ---
> >  arch/arm/mm/fault.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> > 
> > diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
> > index 3232afb6fdc0..5e240deb6ed6 100644
> > --- a/arch/arm/mm/fault.c
> > +++ b/arch/arm/mm/fault.c
> > @@ -547,6 +547,22 @@ hook_fault_code(int nr, int (*fn)(unsigned long, unsigned int, struct pt_regs *)
> >  	fsr_info[nr].name = name;
> >  }
> >  
> > +/*
> > + * Check for ECC double-bit errors in Auxiliary Data Fault Status Register
> > + */
> > +static void check_adfsr_for_ecc(void)
> > +{
> > +	u32 adfsr = 0;
> > +
> > +	asm("mrc p15, 0, %0, c5, c1, 0" : "=r" (adfsr));
> > +
> > +	if (adfsr & (BIT(31) | BIT(23))) {
> > +		pr_alert("ADFSR status 0x%x indicates that an L1 or L2 cache\n"
> > +			 "ECC double-bit error occurred at some time.\n",
> > +			  adfsr);
> > +	}
> > +}
> > +
> >  /*
> >   * Dispatch a data abort to the relevant handler.
> >   */
> > @@ -559,6 +575,7 @@ do_DataAbort(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
> >  	if (!inf->fn(addr, fsr & ~FSR_LNX_PF, regs))
> >  		return;
> >  
> > +	check_adfsr_for_ecc();
> >  	pr_alert("Unhandled fault: %s (0x%03x) at 0x%08lx\n",
> >  		inf->name, fsr, addr);
> >  	show_pte(current->mm, addr);
> > @@ -593,6 +610,7 @@ do_PrefetchAbort(unsigned long addr, unsigned int ifsr, struct pt_regs *regs)
> >  	if (!inf->fn(addr, ifsr | FSR_LNX_PF, regs))
> >  		return;
> >  
> > +	check_adfsr_for_ecc();
> >  	pr_alert("Unhandled prefetch abort: %s (0x%03x) at 0x%08lx\n",
> >  		inf->name, ifsr, addr);
> 
> IIUC at this point the task is preemptible (and interruptible),

It may be preemptible, but isn't necessarily so.  It depends on whether
the FSR-specific function that was called enabled interrupts or not.

So, it would be better to read the ADFSR before calling the FSR-specific
function, to guarantee that we read the value that corresponds to _this_
fault.
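
To illustrate the ordering, here is a rough userspace sketch, not kernel
code.  Everything in it is hypothetical: fake_adfsr stands in for the
register (real code would use the MRC from the patch), and
handler_changes_adfsr() models an FSR-specific function during which the
register content changes, for example because interrupts were enabled and
a later error was recorded.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical stand-in for the ADFSR; not a real register access. */
static uint32_t fake_adfsr;

static uint32_t read_adfsr(void)
{
	return fake_adfsr;
}

/*
 * Hypothetical FSR-specific function.  It models the register content
 * changing while it runs, and returns non-zero for "fault not handled".
 */
static int handler_changes_adfsr(void)
{
	fake_adfsr = BIT(23);
	return 1;
}

/* Ordering in the posted patch: the register is read after the handler. */
static uint32_t sample_after_handler(int (*fn)(void))
{
	if (!fn())
		return 0;
	return read_adfsr();
}

/* Suggested ordering: sample first, report only if still unhandled. */
static uint32_t sample_before_handler(int (*fn)(void))
{
	uint32_t adfsr = read_adfsr();

	if (!fn())
		return 0;
	return adfsr;
}
```

With fake_adfsr preset to BIT(31), sample_before_handler() returns the
value that was present when the fault was taken, while
sample_after_handler() returns whatever the handler left behind.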

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
