Message-ID: <SJ1PR11MB6083664BCFC8047A5FE8F6A9FC22A@SJ1PR11MB6083.namprd11.prod.outlook.com>
Date: Thu, 22 Jun 2023 15:35:27 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Yazen Ghannam <yazen.ghannam@....com>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>
Subject: RE: [PATCH 1/2] x86/mce: Disable preemption for CPER decoding
> All the above is done when the BERT is processed during late init. This
> can be scheduled on any CPU, and it may be preemptible.
> 2) mce_setup() will pull info from the executing CPU, so some info in
> struct mce may be incorrect for the CPU with the error. For example,
> in a dual-socket system, an error logged in socket 1 CPU but
> processed by a socket 0 CPU will save the PPIN of the socket 0 CPU.
> Fix the first issue by locally disabling preemption before calling
> mce_setup().
It doesn't really fix the issue; it just makes the warnings go away.
The BERT record was created because some error crashed the
system. It's being parsed by a CPU that likely had nothing
to do with the actual error that occurred in the previous incarnation
of the OS.
If there is a CPER record in the BERT data that includes CPU
information, that would be the right thing to use. Alternatively,
is there some invalid CPU value that could be loaded into the
"struct mce"?
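A rough userspace sketch of that idea, to make the two options concrete.
Everything here is hypothetical: the struct layouts, the field names, the
sentinel value and the helper are simplified stand-ins for illustration,
not the real "struct mce", not a real CPER section layout, and not kernel
code. The point is only that nothing is read from the executing CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "CPU not known"; not a kernel constant. */
#define SKETCH_CPU_UNKNOWN UINT32_MAX

/* Hypothetical, simplified stand-in for CPU info carried in a record. */
struct sketch_cper_cpu_info {
	bool     has_cpu_id;	/* record carries a usable CPU identifier */
	uint32_t apic_id;
	bool     has_ppin;	/* record carries a usable PPIN */
	uint64_t ppin;
};

/* Hypothetical, simplified stand-in for the CPU fields of struct mce. */
struct sketch_mce {
	uint32_t apicid;
	uint64_t ppin;
	bool     ppin_valid;
};

/*
 * Fill the CPU-identifying fields only from the record. Anything the
 * record does not provide is marked unknown instead of being taken
 * from whichever CPU happens to run the parser.
 */
static void sketch_mce_setup_from_record(struct sketch_mce *m,
					 const struct sketch_cper_cpu_info *rec)
{
	/* Start from "unknown" so missing data is never mistaken for real data. */
	m->apicid = SKETCH_CPU_UNKNOWN;
	m->ppin = 0;
	m->ppin_valid = false;

	if (!rec)
		return;

	if (rec->has_cpu_id)
		m->apicid = rec->apic_id;

	if (rec->has_ppin) {
		m->ppin = rec->ppin;
		m->ppin_valid = true;
	}
}
```

With something along these lines, a record that names its CPU gets that
CPU's values, and a record that does not ends up with an explicit
"unknown" marker rather than the socket 0 values from the example above.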
-Tony