Message-ID: <20250815134259.GA27834@yaz-khff2.amd.com>
Date: Fri, 15 Aug 2025 09:42:59 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
	x86@...nel.org, avadhut.naik@....com, john.allen@....com
Subject: Re: [PATCH v2] x86/mce: Do away with unnecessary context quirks

On Thu, Aug 14, 2025 at 03:17:21PM -0700, Luck, Tony wrote:
> On Thu, Aug 14, 2025 at 05:07:30PM -0400, Yazen Ghannam wrote:
> > On Thu, Aug 14, 2025 at 12:52:19PM -0700, Luck, Tony wrote:
> > > But the first-match nature of the table means that this rule hits
> > > (because neither RIPV nor EIPV is set):
> > > 
> > >         /* Neither return not error IP -- no chance to recover -> PANIC */
> > >         MCESEV(
> > >                 PANIC, "Neither restart nor error IP",
> > >                 EXCP, MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, 0)
> > >                 ),
> > > 
> > 
> > Thanks Tony. I see what you mean.
> > 
> > Do we really need this rule? It is essentially the same as the following
> > rule:
> > 
> >         MCESEV(
> >                 PANIC, "In kernel and no restart IP",
> >                 EXCP, KERNEL, MCGMASK(MCG_STATUS_RIPV, 0)
> >                 ),
> > 
> > ...since we assume "KERNEL" context if RIPV|EIPV are clear after
> > checking the CS register.
> 
> I'm not sure this could ever happen. But if it did, I think I'd like
> to see that message.
> > 
> > The message is not as explicit though. 
> > 
> > I did have an earlier idea that we introduce an "UNKNOWN" context for
> > the !pt_regs case.
> > 
> > We could add the "UNKNOWN" context to the "Neither restart nor error IP"
> > rule. That way it'll be skipped if we have a "USER" context and then it
> > should match the one you want.
> 
> I don't want to do that anywhere except that Sandybridge instruction
> fetch case (which wasn't classified as an erratum, because the h/w
> guys chose to set RIPV==0 and EIPV==0 ... but it was a poor choice.)
> 
> > Also, I just saw this in the Intel SDM:
> > 
> > "For the P6 family processors, if the EIPV flag in the MCG_STATUS MSR is
> > set, the saved contents of CS and EIP registers are directly associated
> > with the error that caused the machine-check exception to be generated;
> > if the flag is clear, the saved instruction pointer may not be associated
> > with the error (see Section 17.3.1.2, “IA32_MCG_STATUS MSR”)."
> > 
> > But I can't tell if this is true just for P6 or all, because the CS
> > register isn't referenced again with EIPV.
> 
> Should probably have said "P6 and newer". The intent of EIPV is to
> indicate that this machine check is because of something that happened
> on the current CPU (remember this bit was defined when all #MC on Intel
> were broadcast, so knowing which CPU(s) are involved, and which have
> just been pulled in to the #MC handler by the broadcast was very
> important.)
> 

Okay, fair enough. It seems like these quirks should stay. Thanks for
the discussion. It really helped me better understand these quirks and
their history.

Thanks,
Yazen
