Message-ID: <f629820c-50cf-7366-975e-68215b3f2bc5@amd.com>
Date:   Tue, 9 May 2023 10:25:09 -0400
From:   Yazen Ghannam <yazen.ghannam@....com>
To:     Shuai Xue <xueshuai@...ux.alibaba.com>, bp@...en8.de,
        tony.luck@...el.com
Cc:     yazen.ghannam@....com, tglx@...utronix.de, mingo@...hat.com,
        dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
        baolin.wang@...ux.alibaba.com, linux-edac@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/mce/amd: init mce severity to handle deferred memory
 failure

On 4/25/23 8:18 AM, Shuai Xue wrote:
> When a deferred UE error is detected, e.g. by the background patrol
> scrubber, it will be handled in the APIC interrupt handler
> amd_deferred_error_interrupt(). The handler will collect MCA banks, init
> the mce struct and process it by notifying the registered MCE decode chain.
> 
> The uc_decode_notifier, one of the notifiers on the MCE decode chain, will
> process memory failure, but only for MCE_AO_SEVERITY and
> MCE_DEFERRED_SEVERITY. However, the APIC interrupt handler does not init
> the mce severity, and the uninitialized severity is 0 (MCE_NO_SEVERITY).
> 
> To handle the deferred memory failure case, init mce severity when logging
> MCA banks.
> 
> Signed-off-by: Shuai Xue <xueshuai@...ux.alibaba.com>
>

Hi Shuai Xue,

I think this patch is fair to do. But it won't have the intended effect
in practice.

The value in MCA_ADDR for DRAM ECC errors will be a memory controller
"normalized address". This is not a system physical address that the OS
can use to take action.

The mce_usable_address() function needs to be updated to handle this.
I'll send a patchset this week to do so. Afterwards, the
uc_decode_notifier will not attempt to handle these errors.
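
To illustrate the two gates discussed above, here is a minimal userspace
sketch. It is a simplified model, not the kernel source. Every name with a
model_/MODEL_ prefix is hypothetical, the enum values are placeholders
(only "no severity is 0" comes from this thread), and the addr_is_spa flag
is an assumption standing in for whatever check mce_usable_address() ends
up doing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder severities; only "no severity == 0" is taken from the thread. */
enum model_severity {
	MODEL_NO_SEVERITY = 0,
	MODEL_DEFERRED_SEVERITY,
	MODEL_AO_SEVERITY,
};

/* Simplified stand-in for the fields of struct mce that matter here. */
struct model_mce {
	enum model_severity severity;
	uint64_t addr;      /* value read from MCA_ADDR */
	bool addr_is_spa;   /* assumption: true only for a system physical address */
};

/* Stand-in for mce_usable_address(): reject normalized addresses. */
static bool model_usable_address(const struct model_mce *m)
{
	return m->addr_is_spa;
}

/* Models whether a uc_decode-style notifier would act on the record. */
static bool model_uc_decode_would_handle(const struct model_mce *m)
{
	if (m == NULL || !model_usable_address(m))
		return false;

	if (m->severity != MODEL_AO_SEVERITY &&
	    m->severity != MODEL_DEFERRED_SEVERITY)
		return false;

	return true;
}
```

In this model, a record with an uninitialized (zero) severity is skipped,
which is the case the patch addresses. A record carrying a normalized
address is skipped as well, regardless of its severity.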

Thanks,
Yazen
