[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bf911628-71be-0dca-f1c7-c12e681bd37f@arm.com>
Date: Thu, 13 Oct 2016 14:00:30 +0100
From: Suzuki K Poulose <Suzuki.Poulose@....com>
To: Tyler Baicar <tbaicar@...eaurora.org>, christoffer.dall@...aro.org,
marc.zyngier@....com, pbonzini@...hat.com, rkrcmar@...hat.com,
linux@...linux.org.uk, catalin.marinas@....com,
will.deacon@....com, rjw@...ysocki.net, lenb@...nel.org,
matt@...eblueprint.co.uk, robert.moore@...el.com,
lv.zheng@...el.com, mark.rutland@....com, james.morse@....com,
akpm@...ux-foundation.org, sandeepa.s.prabhu@...il.com,
shijie.huang@....com, paul.gortmaker@...driver.com,
tomasz.nowicki@...aro.org, fu.wei@...aro.org, rostedt@...dmis.org,
bristot@...hat.com, linux-arm-kernel@...ts.infradead.org,
kvmarm@...ts.cs.columbia.edu, Dkvm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-acpi@...r.kernel.org,
linux-efi@...r.kernel.org, devel@...ica.org
Cc: "Jonathan (Zhixiong) Zhang" <zjzhang@...eaurora.org>
Subject: Re: [PATCH V3 06/10] acpi: apei: panic OS with fatal error status
block
On 07/10/16 22:31, Tyler Baicar wrote:
> From: "Jonathan (Zhixiong) Zhang" <zjzhang@...eaurora.org>
>
> Even if an error status block's severity is fatal, the kernel does not
> honor the severity level and panic.
>
> With the firmware first model, the platform could inform the OS about a
> fatal hardware error through the non-NMI GHES notification type. The OS
> should panic when a hardware error record is received with this
> severity.
>
> Call panic() after CPER data in error status block is printed if
> severity is fatal, before each error section is handled.
>
> Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@...eaurora.org>
> ---
> drivers/acpi/apei/ghes.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 28d5a09..36894c8 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -141,6 +141,8 @@ static unsigned long ghes_estatus_pool_size_request;
> static struct ghes_estatus_cache *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
> static atomic_t ghes_estatus_cache_alloced;
>
> +static int ghes_panic_timeout __read_mostly = 30;
> +
> static int ghes_ioremap_init(void)
> {
> ghes_ioremap_area = __get_vm_area(PAGE_SIZE * GHES_IOREMAP_PAGES,
> @@ -715,6 +717,12 @@ static int ghes_proc(struct ghes *ghes)
> if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus))
> ghes_estatus_cache_add(ghes->generic, ghes->estatus);
> }
> + if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) {
> + if (panic_timeout == 0)
> + panic_timeout = ghes_panic_timeout;
> + panic("Fatal hardware error!");
I think there is a chance that we might miss the o/p of ghes_print_estatus() as we use
no pfx, and it could default to the normal loglevel and would never get printed
if panic() is encountered before it. On the other hand, there is already a
__ghes_panic() which does similar stuff. Is there a way we could reuse
(may be even parts of) it ? Or at least use KERN_EMERG for the ghes_print_estatus(),
if the severity could result in panic() ?
Cheers
Suzuki
Powered by blists - more mailing lists