linux-kernel - Re: spurious mce Hardware Error messages in next-20250912

Message-ID: <20250918210005.GA2150610@yaz-khff2.amd.com>
Date: Thu, 18 Sep 2025 17:00:05 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: Nikolay Borisov <nik.borisov@...e.com>
Cc: Bert Karwatzki <spasswolf@....de>, Borislav Petkov <bp@...en8.de>,
	Tony Luck <tony.luck@...el.com>, linux-kernel@...r.kernel.org,
	linux-next@...r.kernel.org, linux-edac@...r.kernel.org,
	linux-acpi@...r.kernel.org, x86@...nel.org, rafael@...nel.org,
	qiuxu.zhuo@...el.com, Smita.KoralahalliChannabasappa@....com
Subject: Re: spurious mce Hardware Error messages in next-20250912

On Thu, Sep 18, 2025 at 01:20:58PM +0300, Nikolay Borisov wrote:
> 
> 
> On 9/17/25 22:26, Yazen Ghannam wrote:
> <snip>
> 
> 
> > Right, so it seems we have bogus data logged in these registers. And
> > this is unrelated to the recent patches.
> > 
> > We have some combination of bits set in MCA_DESTAT registers. The
> > deferred error interrupt hasn't fired (at least from the latest
> > example).
> > 
> > There does seem to be some combination of bits that are always set and
> > others flip between examples.
> > 
> > I'll highlight this to our hardware folks. But I don't think there's
> > much we can do other than filter these out somehow.
> > 
> > I can add two checks to the patch to make it more like the current
> > behavior.
> > 
> > 1) Check for 'Deferred' status bit when logging from the MCA_DESTAT.
> > This was in the debug patch I shared.
> 
> According to AMD APM 9.3.3.4:
> 
> "If the error being logged is a deferred error, then the error will be
> logged to MCA_DESTAT."
> 
> So this means that when Valid is set in DESTAT then the error MUST BE
> deferred. I.e. I think it's invalid to have valid && !deferred in DESTAT,
> no?

Yes, correct. That is why this issue is perplexing.
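
A minimal sketch of the Valid-plus-Deferred filter discussed above, written
as standalone C rather than kernel code. The macro and function names are
hypothetical; the bit positions assumed are Valid = bit 63 and Deferred =
bit 44 of the status register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in MCA_STATUS / MCA_DESTAT; names are illustrative. */
#define MCA_ST_VALID    (1ULL << 63)
#define MCA_ST_DEFERRED (1ULL << 44)

/*
 * Accept a value read from MCA_DESTAT only when both Valid and Deferred
 * are set. A Valid entry without Deferred is treated as bogus data.
 */
static bool destat_is_loggable(uint64_t destat)
{
	const uint64_t want = MCA_ST_VALID | MCA_ST_DEFERRED;

	return (destat & want) == want;
}
```

With a filter like this, a bank whose MCA_DESTAT has Valid set but Deferred
clear would be skipped instead of reported.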

> 
> Additionally, nowhere in the APM is it mentioned what the default value of
> MCA_CONFIG.LogDeferredEn is, so as it stands you are now working with the
> assumption that it's 1 and DESTAT is always a redundant copy of STATUS.
> 

The value is determined by the platform. The Linux code is structured so
the data is gathered from any possible source. That's why there are a
few checks to determine which register to look at.

> Btw looking at the output that Bert has provided it seems that indeed
> MCA_CONFIG.LogDeferredEn is 0 by default:

Banks 9 to 14 seem to have bogus values. And this seems to be the cause
of our mishandling here.

You can see the difference compared to the other banks. Banks 7 and 8
are good comparisons as they are of the same "type" (L3 cache).

> 
> "
> LogDeferredEn—Bit 34. Enable logging of deferred errors in MCA_STATUS. 0=Log
> deferred errors only in MCA_DESTAT and MCA_DEADDR. 1=Log deferred errors in
> MCA_STATUS and MCA_ADDR in addition to MCA_DESTAT and MCA_DEADDR. This bit
> does not affect logging of deferred errors in MCA_SYND or MCA_MISCx.
> "
> 
> 
> I think the polling code is slightly broken now for AMD. The order of
> operation per poll cycle should be:
> 
> 1. Check MCA_STATUS -> report if there is anything, clear the bank
> 2. (In the same cycle) -> Check MCA_DESTAT and report if there is anything,
> clear the deferred.
>

It is unlikely to have two independent errors in MCA_STATUS and
MCA_DESTAT due to how errors can be overwritten by more severe errors.
By default, our reference platform implementation has
MCA_CONFIG[LogDeferredInMcaStat] enabled. So a deferred error in
MCA_STATUS will only be overwritten by an uncorrectable (#MC) error. In
this case, MCA_STATUS will be cleared by the #MC handler. And so
MCA_DESTAT acts as a backup.

But you're right there is a gap here that we can try to fill if a
platform ever changes this config bit.
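
One way the per-cycle ordering proposed in the quoted text could look,
sketched as standalone C over a register snapshot. This is not the kernel's
implementation: the struct, the function, and the macro names are
hypothetical, the "redundant copy" rule is an assumption made for
illustration, and bit 34 for the config bit is taken from the APM text
quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define ST_VALID       (1ULL << 63)  /* assumed: Valid */
#define ST_DEFERRED    (1ULL << 44)  /* assumed: Deferred */
#define CFG_LOG_DEF_EN (1ULL << 34)  /* bit 34, per the quoted APM text */

/* Hypothetical snapshot of one bank's registers. */
struct bank_regs {
	uint64_t status;  /* MCA_STATUS */
	uint64_t destat;  /* MCA_DESTAT */
	uint64_t config;  /* MCA_CONFIG */
};

/*
 * Poll one bank: handle MCA_STATUS first, then MCA_DESTAT in the same
 * cycle. Returns the number of records that would be reported and clears
 * the registers that were consumed.
 */
static int poll_bank(struct bank_regs *b)
{
	const uint64_t want = ST_VALID | ST_DEFERRED;
	bool status_had_deferred = false;
	int logged = 0;

	if (b->status & ST_VALID) {
		status_had_deferred = (b->status & ST_DEFERRED) != 0;
		logged++;
		b->status = 0;
	}

	/* Valid without Deferred in DESTAT is treated as bogus: left untouched. */
	if ((b->destat & want) == want) {
		/* Assumption: DESTAT only duplicates STATUS when the config bit is set. */
		bool redundant = status_had_deferred &&
				 (b->config & CFG_LOG_DEF_EN) != 0;

		if (!redundant)
			logged++;
		b->destat = 0;
	}

	return logged;
}
```

With the config bit set and the same deferred error in both registers, this
reports one record. With the config bit clear, an error in MCA_STATUS and a
deferred error in MCA_DESTAT are reported separately.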

For the current issue, it does seem that the registers contain junk
values. And we are only now seeing this with the recent rework.

Bert, can you please provide two more register dumps from the script?

Our hardware team is interested to see if the values remain consistent
or change between reads.

Thanks,
Yazen