Message-ID: <f03b6c61-1669-c03e-310c-cc1364cf30a8@amd.com>
Date:   Thu, 22 Jun 2023 12:24:15 -0400
From:   Yazen Ghannam <yazen.ghannam@....com>
To:     "Luck, Tony" <tony.luck@...el.com>,
        "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>
Cc:     yazen.ghannam@....com,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH 1/2] x86/mce: Disable preemption for CPER decoding

On 6/22/2023 11:35 AM, Luck, Tony wrote:
>> All the above is done when the BERT is processed during late init. This
>> can be scheduled on any CPU, and it may be preemptible.
> 
>> 2) mce_setup() will pull info from the executing CPU, so some info in
>>    struct mce may be incorrect for the CPU with the error. For example,
>>    in a dual-socket system, an error logged in socket 1 CPU but
>>    processed by a socket 0 CPU will save the PPIN of the socket 0 CPU.
> 
>> Fix the first issue by locally disabling preemption before calling
>> mce_setup().
> 
> It doesn't really fix the issue, it just makes the warnings go away.
> 
> The BERT record was created because some error crashed the
> system. It's being parsed by a CPU that likely had nothing
> to do with the actual error that occurred in the previous incarnation
> of the OS.
>

Yes, these are true statements.

> If there is a CPER record in the BERT data that includes CPU
> information, that would be the right thing to use. Alternatively
> is there some invalid CPU value that could be loaded into the
> "struct mce"?
> 

This is why we search for the logical CPU number using the Local APIC
ID provided in the CPER, and then fill in the relevant data using that
CPU number.
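
For illustration only, the lookup described above could be sketched in
user-space C roughly as follows. This is not the kernel code from the
patch: the table, the struct layouts, and the names cpu_table,
mce_sketch, find_cpu_by_apicid() and fill_from_cper() are hypothetical
stand-ins, and using -1 as the "no matching CPU" value is an assumption
made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */
#define CPU_UNKNOWN    (-1)	/* assumed marker for "no matching CPU" */

/* Hypothetical per-CPU data; a stand-in for what the kernel tracks. */
struct cpu_info_sketch {
	uint32_t apicid;	/* Local APIC ID of this logical CPU */
	int socket;		/* socket the CPU belongs to */
	uint64_t ppin;		/* per-socket PPIN (made-up values below) */
};

/* Made-up dual-socket topology: CPUs 0-1 in socket 0, CPUs 2-3 in socket 1. */
static const struct cpu_info_sketch cpu_table[NR_CPUS_SKETCH] = {
	{ 0x00, 0, 0x1111 },
	{ 0x01, 0, 0x1111 },
	{ 0x20, 1, 0x2222 },
	{ 0x21, 1, 0x2222 },
};

/* Hypothetical subset of the fields a "struct mce"-like record carries. */
struct mce_sketch {
	int extcpu;
	int socketid;
	uint64_t ppin;
};

/* Map a Local APIC ID from the CPER to a logical CPU number. */
static int find_cpu_by_apicid(uint32_t lapic_id)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (cpu_table[cpu].apicid == lapic_id)
			return cpu;
	}
	return CPU_UNKNOWN;
}

/*
 * Fill the record from the CPU that logged the error, not from the CPU
 * that happens to be parsing the record.
 */
static void fill_from_cper(struct mce_sketch *m, uint32_t lapic_id)
{
	assert(m != NULL);

	int cpu = find_cpu_by_apicid(lapic_id);

	m->extcpu = cpu;
	if (cpu == CPU_UNKNOWN) {
		/* No match: leave explicit "unknown" values. */
		m->socketid = -1;
		m->ppin = 0;
		return;
	}
	m->socketid = cpu_table[cpu].socket;
	m->ppin = cpu_table[cpu].ppin;
}
```

With this made-up table, a record carrying Local APIC ID 0x20 is
attributed to logical CPU 2 in socket 1 regardless of which CPU runs
the parsing code, and an ID that matches no CPU leaves the fields
marked as unknown.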

Thanks,
Yazen
