Message-ID: <6eb3c5f4-2198-d501-7320-ea6209a63465@linux.alibaba.com>
Date:   Wed, 2 Nov 2022 19:53:44 +0800
From:   Shuai Xue <xueshuai@...ux.alibaba.com>
To:     "Luck, Tony" <tony.luck@...el.com>,
        "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     "lenb@...nel.org" <lenb@...nel.org>,
        "james.morse@....com" <james.morse@....com>,
        "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "jarkko@...nel.org" <jarkko@...nel.org>,
        "naoya.horiguchi@....com" <naoya.horiguchi@....com>,
        "linmiaohe@...wei.com" <linmiaohe@...wei.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "cuibixuan@...ux.alibaba.com" <cuibixuan@...ux.alibaba.com>,
        "baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
        "zhuo.song@...ux.alibaba.com" <zhuo.song@...ux.alibaba.com>
Subject: Re: [PATCH] ACPI: APEI: set memory failure flags as
 MF_ACTION_REQUIRED on action required events



On 2022/10/29 1:25 AM, Luck, Tony wrote:
>>> cper_sec_mem_err::error_type identifies the type of error that occurred
>>> if CPER_MEM_VALID_ERROR_TYPE is set. So, set memory failure flags as 0
>>> for Scrub Uncorrected Error (type 14). Otherwise, set memory failure
>>> flags as MF_ACTION_REQUIRED.
> 
> On x86 the "action required" cases are signaled by a synchronous machine check
> that is delivered before the instruction that is attempting to consume the uncorrected
> data retires. I.e., it is guaranteed that the uncorrected error has not been propagated
> because it is not visible in any architectural state.

On Arm, if a 2-bit (uncorrectable) error is detected and the memory access has been
architecturally executed, that error is considered "consumed". The CPU then takes a
synchronous error exception, signaled as a synchronous external abort (SEA), which is
analogous to an MCE on x86.

> 
> APEI signaled errors don't fall into that category on x86 ... the uncorrected data
> could have been consumed and propagated long before the signaling used for
> APEI can alert the OS.
> 
> Does ARM deliver APEI signals synchronously?
> 
> If not, then this patch might deliver a false sense of security to applications
> about the state of uncorrected data in the system.
> 

Well, not always. There are many APEI notification types, such as SCI, GSIV, GPIO,
SDEI, SEA, etc. Not all APEI notifications are synchronous; it depends on the
hardware signal. As far as I know, if a UE is detected and consumed, a synchronous
external abort is signaled to firmware, which then performs a first-level triage and
synchronously notifies the OS via an SDEI or SEA notification. On the other hand, if
a CE is detected, an asynchronous interrupt is signaled and firmware can notify the
OS via GPIO or GSIV.
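
To make the two rules discussed above concrete, here is a rough user-space sketch
in C. It is not the patch itself. The struct, the enum, and both function names are
hypothetical, the constant values are local stand-ins that I believe match the
kernel headers, and treating a record without a valid error type as action required
is only my reading of the "otherwise" in the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumed values); the real definitions live in kernel headers. */
#define CPER_MEM_VALID_ERROR_TYPE      0x4000
#define MF_ACTION_REQUIRED             (1 << 1)
/* Hypothetical name for "Scrub Uncorrected Error (type 14)" from the description. */
#define MEM_ERR_TYPE_SCRUB_UNCORRECTED 14

/* Hypothetical, trimmed-down view of the two cper_sec_mem_err fields used here. */
struct mem_err_sketch {
	uint64_t validation_bits;
	uint8_t error_type;
};

/* Hypothetical labels for the notification types named in this mail. */
enum notify_sketch {
	NOTIFY_SCI,
	NOTIFY_GSIV,
	NOTIFY_GPIO,
	NOTIFY_SDEI,
	NOTIFY_SEA,
};

/*
 * Flag rule from the patch description: 0 for a scrub uncorrected error
 * whose error_type field is marked valid, MF_ACTION_REQUIRED otherwise.
 */
int mf_flags_for_mem_err(const struct mem_err_sketch *mem)
{
	assert(mem != NULL);

	if ((mem->validation_bits & CPER_MEM_VALID_ERROR_TYPE) &&
	    mem->error_type == MEM_ERR_TYPE_SCRUB_UNCORRECTED)
		return 0;

	return MF_ACTION_REQUIRED;
}

/*
 * Which notifications the text above describes as synchronous on Arm
 * ("as far as I know": SDEI and SEA). Everything else is treated as
 * asynchronous in this sketch.
 */
bool notify_is_synchronous(enum notify_sketch type)
{
	return type == NOTIFY_SDEI || type == NOTIFY_SEA;
}
```

With these helpers, a record with a valid error_type of 14 yields flags of 0, any
other record yields MF_ACTION_REQUIRED, and only SDEI and SEA count as synchronous.
Whether the flag rule should also look at the notification type is exactly the
question raised in this thread, so the sketch keeps the two checks separate.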

Best Regards,
Shuai

