Date: Fri, 2 Feb 2024 21:36:27 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: Tong Tiangen <tongtiangen@...wei.com>, Thomas Gleixner
	<tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
	"wangkefeng.wang@...wei.com" <wangkefeng.wang@...wei.com>, Dave Hansen
	<dave.hansen@...ux.intel.com>, "x86@...nel.org" <x86@...nel.org>, "H. Peter
 Anvin" <hpa@...or.com>, Andy Lutomirski <luto@...nel.org>, Peter Zijlstra
	<peterz@...radead.org>, Andrew Morton <akpm@...ux-foundation.org>, "Naoya
 Horiguchi" <naoya.horiguchi@....com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-edac@...r.kernel.org"
	<linux-edac@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
	Guohanjun <guohanjun@...wei.com>
Subject: RE: [PATCH -next v4 2/3] x86/mce: rename MCE_IN_KERNEL_COPYIN to
 MCE_IN_KERNEL_COPY_MC

> > At least on Intel you can only get a machine check for operation on poison data LOAD.
> > Not for a STORE. I believe that is generally true - other arches to confirm.
>
> So what happens if you store to a poisoned cacheline on Intel? It'll
> raise a poison consumption error when that cacheline is loaded in the
> cache? Because you need to load that line into the cache for writing,
> I'd presume...

There are two places in the pipeline where poison is significant.

1) When the memory controller gets a request to fetch some data. If the ECC
check on the bits returned from the DIMMs fails, the memory controller will log
a "UCNA" signature error to a machine check bank for the memory channel
where the DIMMs live. If CMCI is enabled for that bank, then a CMCI is
sent to all logical CPUs that are in the scope of that bank (generally a
CPU socket). The data is marked with a POISON signature and passed
to the entity that requested it. Caches support this POISON signature
and preserve it as data is moved between caches, or written back to
memory. The request may have been a prefetch or a speculative read. In these
cases there won't be a machine check. Linux uc_decode_notifier() will
try to offline pages when it sees UCNA signatures.

2) When a CPU core tries to retire an instruction that consumes poison
data, or needs to retire a poisoned instruction. These log an SRAR signature
into a core scoped bank (on most Xeons to date bank 0 for poisoned instructions,
bank 1 for poisoned data consumption). Then they signal a machine check.
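
To make the two signatures above concrete, here is a minimal sketch of how
a logged IA32_MCi_STATUS value can be sorted into UCNA versus SRAR. The bit
positions follow the architectural layout in the Intel SDM (the same values
the kernel names MCI_STATUS_*). The enum and the functions classify_status()
and mce_sig_name() are hypothetical names made up for this illustration, not
kernel APIs. The corrected, SRAO and fatal classes are not discussed in this
mail and are included only so the decision tree is complete.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural IA32_MCi_STATUS bits (Intel SDM, machine-check architecture). */
#define MCI_STATUS_VAL (1ULL << 63) /* register holds a valid error   */
#define MCI_STATUS_UC  (1ULL << 61) /* error was not corrected        */
#define MCI_STATUS_EN  (1ULL << 60) /* error reporting was enabled    */
#define MCI_STATUS_PCC (1ULL << 57) /* processor context corrupted    */
#define MCI_STATUS_S   (1ULL << 56) /* signaled via machine check     */
#define MCI_STATUS_AR  (1ULL << 55) /* software action required       */

/* Hypothetical classification labels for this sketch. */
enum mce_sig { SIG_NONE, SIG_CORRECTED, SIG_UCNA, SIG_SRAO, SIG_SRAR, SIG_FATAL };

/* Classify one status value by its UC/PCC/S/AR bits. */
enum mce_sig classify_status(uint64_t status)
{
    if (!(status & MCI_STATUS_VAL))
        return SIG_NONE;
    if (!(status & MCI_STATUS_UC))
        return SIG_CORRECTED;
    if (status & MCI_STATUS_PCC)
        return SIG_FATAL;
    /* S=0: logged but not signaled by #MC, i.e. the memory controller case. */
    if (!(status & MCI_STATUS_S))
        return SIG_UCNA;
    /* S=1: signaled by #MC; AR=1 is the poison consumption case. */
    return (status & MCI_STATUS_AR) ? SIG_SRAR : SIG_SRAO;
}

/* Printable name for a label. */
const char *mce_sig_name(enum mce_sig sig)
{
    static const char *const names[] = {
        "none", "corrected", "UCNA", "SRAO", "SRAR", "fatal"
    };
    assert((unsigned)sig < sizeof(names) / sizeof(names[0]));
    return names[sig];
}
```

With this sketch, a status of VAL|UC|EN classifies as UCNA (case 1), and
VAL|UC|EN|S|AR classifies as SRAR (case 2).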

> What happens if you have bits flipped in the cacheline you want to write
> to?
>
> That's fine because you're overwriting them anyway?
>
> I'd presume ECC check gets performed on cacheline load and then you'll
> have to raise an #MC...

Partial cacheline stores to data marked as POISON in the cache maintain
the poison status. Full cacheline writes (certainly with the MOVDIR64B instruction,
possibly with some AVX512 instructions) can clear the POISON status (since
you have all new data). A sequence of partial cache line stores that overwrite
all data in a cache line will NOT clear the POISON status.
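
The store rules above can be captured in a small software model. This is
only a toy sketch of the described behavior, not how the hardware is built:
struct model_line and the model_*() functions are hypothetical names, and
the model assumes a full-line write always clears poison, whereas the text
above only says it can.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 64

/* Toy model of one cache line carrying a POISON marker. */
struct model_line {
    uint8_t data[LINE_BYTES];
    bool poison;
};

/* Ordinary store of len bytes at offset off: data changes, poison is kept. */
void model_partial_store(struct model_line *l, size_t off,
                         const void *src, size_t len)
{
    assert(off <= LINE_BYTES && len <= LINE_BYTES - off);
    memcpy(l->data + off, src, len);
    /* l->poison deliberately left untouched */
}

/* Single 64-byte write (MOVDIR64B-like): all data is new, poison cleared. */
void model_full_line_write(struct model_line *l, const uint8_t src[LINE_BYTES])
{
    memcpy(l->data, src, LINE_BYTES);
    l->poison = false;
}

/* A consuming load would raise a machine check only if the line is poisoned. */
bool model_load_hits_poison(const struct model_line *l)
{
    return l->poison;
}
```

In this model, eight 8-byte calls to model_partial_store() covering a
poisoned line leave model_load_hits_poison() true, while a single
model_full_line_write() makes it false. Neither store function logs or
signals anything, matching the description above.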

Nothing is logged or signaled when updating data in the cache.

-Tony
