lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f5d171fd-8fc6-cd39-1467-45ce29517f47@arm.com>
Date:   Tue, 25 Aug 2020 15:51:47 +0100
From:   James Morse <james.morse@....com>
To:     Liguang Zhang <zhangliguang@...ux.alibaba.com>
Cc:     Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] arm64: traps: clean up arm64_ras_serror_get_severity()

Hi Zhang,

On 12/08/2020 12:09, Liguang Zhang wrote:
> Function arm64_is_fatal_ras_serror() is always called after
> arm64_is_ras_serror(), so we should remove some needless
> arm64_is_ras_serror() call in function arm64_ras_serror_get_severity().

> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> index cee5928e1b7d..287b4d64dc67 100644
> --- a/arch/arm64/include/asm/traps.h
> +++ b/arch/arm64/include/asm/traps.h
> @@ -79,13 +79,6 @@ static inline bool arm64_is_ras_serror(u32 esr)
>   */
>  static inline u32 arm64_ras_serror_get_severity(u32 esr)
>  {
> -	u32 aet = esr & ESR_ELx_AET;
> -
> -	if (!arm64_is_ras_serror(esr)) {
> -		/* Not a RAS error, we can't interpret the ESR. */
> -		return ESR_ELx_AET_UC;
> -	}

I agree this can go, it looks like I had it here as a sanity check while the KVM bits were
sorted out.

Please also remove the comment that says it does this:
| * Non-RAS SError's are reported as Uncontained/Uncategorized.

This becomes the callers problem.


>  	/*
>  	 * AET is RES0 if 'the value returned in the DFSC field is not
>  	 * [ESR_ELx_FSC_SERROR]'

> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 13ebd5ca2070..635d4cca0a4b 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -913,7 +913,7 @@ bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsigned int esr)
>  	case ESR_ELx_AET_UC:	/* Uncontainable or Uncategorized error */
>  	default:
>  		/* Error has been silently propagated */
> -		arm64_serror_panic(regs, esr);
> +		return true;

KVM depends on this, please don't remove it.

What does 'fatal' mean?
To the arch code it means panic(), as we don't (yet) have the information to fix the
error. But to KVM 'fatal' means kill-the-guest. KVM does this as without user-space's
involvement, there is very little else it can do.

KVM can only do this if the error is contained. As it must have been contained by stage2,
so the host can keep running. But if the error was reported as uncontained, KVM would need
to panic() the host.

(An example of an Uncontained error is a store that went to the wrong address due to
corruption that wasn't caught in time. We can't trust any value in memory once we've seen
one of these.)


I agree it looks funny, but it was simpler for the arch code helper to do this, instead of
having an 'arm64_is_uncontained_ras_serror(), as now you'd always have to check three things.


>  	}
>  }


Thanks,

James

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ