lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 7 Apr 2017 20:37:03 +0000 From: "Ghannam, Yazen" <Yazen.Ghannam@....com> To: Borislav Petkov <bp@...en8.de> CC: "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>, Tony Luck <tony.luck@...el.com>, "x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org> Subject: RE: [PATCH v2 1/2] x86/mce/AMD: Redo use of SMCA MCA_DE{STAT,ADDR} registers > -----Original Message----- > From: Borislav Petkov [mailto:bp@...en8.de] > Sent: Wednesday, April 05, 2017 4:05 PM > To: Ghannam, Yazen <Yazen.Ghannam@....com> > Cc: linux-edac@...r.kernel.org; Tony Luck <tony.luck@...el.com>; > x86@...nel.org; linux-kernel@...r.kernel.org > Subject: Re: [PATCH v2 1/2] x86/mce/AMD: Redo use of SMCA > MCA_DE{STAT,ADDR} registers > > On Wed, Apr 05, 2017 at 07:29:48PM +0000, Ghannam, Yazen wrote: > > If it's set, then I expect a Deferred error in MCA_STATUS since any > > Correctable Errors will be overwritten. Multiple bank types can > > generate Deferred errors so there may also be cases where for some > > types a valid Uncorrectable error happens and overwrites the Deferred > > error before we can handle it. In which case we lose the Deferred error if > we don't check MCA_DESTAT. > > So if we have an UE, wouldn't that raise an #MC? I guess in such cases we > should concentrate only on the deferred errors and let the #MC handler deal > with them. As we do now. > Right. > > If it's not set, then it's possible to have a valid Correctable error > > in MCA_STATUS while the valid Deferred error is in MCA_DESTAT. > > What's logging the CE? We probably should log it too before something > overwrites it. > CEs will get picked up by polling or if we hit a threshold. > Anyway, ok, I think I know what needs to happen now: > > amd_deferred_error_interrupt: > > if (__log_error_deferred(bank)) > return; > > This one read MC?_STATUS and does the logging for when the deferred error > is in the normal MSRs. It returns true if it succeeded. It reads and hands down > both MC?_STATUS and MC?_ADDR to __log_error() so that it doesn't have to > read MC?_STATUS twice. > Okay. But do we need a return value? I'm thinking we can go through all banks and log any and all Deferred errors rather than just the first one we find. I asked and this is the preferred method like how we do in the #MC handler. The same applies to the thresholding interrupt handler. > If __log_error_deferred() has read a different type of error, we still log it? I'm > not sure about this. I guess we can ignore that case for now. > Yeah, we should ignore other error types. They'll get picked up by other methods. > Then: > > if (mca_flags.smca) > __log_error_deferred_smca(bank)); > > which handles the SMCA case. It too reads MSR_AMD64_SMCA_MCx_DESTAT > and MSR_AMD64_SMCA_MCx_DEADDR and hands them down to > __log_error() for logging. > Can this just be included in the __log_error_deferred()? I'm thinking something similar to one of your earlier suggestions. Like this: static void __log_error_deferred(unsigned int bank) { u64 status, addr; bool logged = false; rdmsrl(msr_ops.status, status); if (is_deferred_error(status)) { rdmsrl(msr_ops.addr, addr); __log_error(bank, status, addr, 0); wrmsrl(msr_ops.status, 0); logged = true; } if (!mca_flags.smca) return; if (logged) { wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT, 0); return; } rdmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), status); if (is_deferred_error(status)) { rdmsrl(MSR_AMD64_SMCA_MCx_DEADDR, addr); __log_error(bank, status, addr, 0); wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT, 0); } } > For the __log_error() call in amd_threshold_interrupt(), you define a > log_error() wrapper which reads the default MSRs and hands them down to > __log_error(). > Okay. > So __log_error() always gets STATUS and ADDR MSR values and it doesn't > need to read them from the MSRs but only log them. > Okay. Thanks, Yazen
Powered by blists - more mailing lists