Message-ID: <20160708104828.GE3808@pd.tnic>
Date:	Fri, 8 Jul 2016 12:48:28 +0200
From:	Borislav Petkov <bp@...en8.de>
To:	Ingo Molnar <mingo@...nel.org>,
	Yazen Ghannam <Yazen.Ghannam@....com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH 3/6] x86/mce: Add support for new MCA_SYND register

On Fri, Jul 08, 2016 at 12:26:48PM +0200, Ingo Molnar wrote:
> So is 'ECC syndrome' a fancy word and a complicated process for
> identifying what data got corrupted, in a more accurate fashion than
> what we had before?

The syndrome has always been there, ever since K8 at least. This patch
simply makes SMCA systems read it from a different MSR.

The syndrome is part of the magic math behind Error Correction Codes; it
can be used to pinpoint which bits in the word at that memory address
were flipped.
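
To make the "points to the flipped bit" part concrete, here is a toy
sketch using a textbook Hamming(7,4) code. This is NOT the ECC scheme a
real DRAM controller uses (those are wider and more involved); the
function names are made up for illustration. The point is only that the
syndrome of a clean codeword is 0 and the syndrome after a single bit
flip equals the position of the flipped bit.

```c
#include <stdint.h>

/* Parity (XOR of all bits) of an 8-bit value. */
static unsigned parity8(unsigned x)
{
	x ^= x >> 4;
	x ^= x >> 2;
	x ^= x >> 1;
	return x & 1;
}

/*
 * Toy Hamming(7,4) encoder, illustration only.
 * Codeword positions are 1..7, position k is stored in bit k-1.
 * Parity bits sit at positions 1, 2, 4; data bits at 3, 5, 6, 7.
 */
static unsigned hamming74_encode(unsigned data)
{
	unsigned cw = 0;

	cw |= ((data >> 0) & 1) << 2;	/* position 3 */
	cw |= ((data >> 1) & 1) << 4;	/* position 5 */
	cw |= ((data >> 2) & 1) << 5;	/* position 6 */
	cw |= ((data >> 3) & 1) << 6;	/* position 7 */

	cw |= parity8(cw & 0x55) << 0;	/* p1 covers positions 1,3,5,7 */
	cw |= parity8(cw & 0x66) << 1;	/* p2 covers positions 2,3,6,7 */
	cw |= parity8(cw & 0x78) << 3;	/* p4 covers positions 4,5,6,7 */

	return cw;
}

/*
 * Recompute the three parity checks. 0 means no error detected; for a
 * single flipped bit the result is the 1-based position of that bit.
 */
static unsigned hamming74_syndrome(unsigned cw)
{
	return parity8(cw & 0x55) |
	       parity8(cw & 0x66) << 1 |
	       parity8(cw & 0x78) << 2;
}
```

So flipping, say, position 5 of any valid codeword yields syndrome 5,
and XORing that bit back repairs the word.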

OOOOh wait a minute!

I'm just getting the sickest idea:

@Yazen, is the syndrome max 16 bits on SMCA? Because if so - and I
would bet good money it is so - then we can stuff it into its old place
in the MCI_STATUS register part of struct mce, i.e. mce->status.

And then you won't need to touch the tracepoint and any of that.

Because you do:

	rdmsrl(MSR_AMD64_SMCA_MCx_SYND(bank), m.synd)

and I'll venture a good guess that not all 64 bits of that MSR are the
syndrome.

Right?

If I'm right, all those patches adding syndrome support need to be
reworked.
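
A rough sketch of what "stuff it into its old place" could look like,
assuming the syndrome really fits in 16 bits. The shifts and mask below
are HYPOTHETICAL placeholders, not taken from any AMD manual, and the
helper names are made up; the real field positions in the SYND MSR and
in MCI_STATUS would have to come from the documentation.

```c
#include <stdint.h>

/* HYPOTHETICAL layout, for illustration only. */
#define EX_SYND_MSR_SHIFT	32		/* where the syndrome sits in the SYND MSR */
#define EX_STATUS_SYND_SHIFT	40		/* where it would go in the status word */
#define EX_SYND_MASK		0xffffULL	/* assumes a 16-bit syndrome */

/* Pull the (assumed 16-bit) syndrome field out of the raw 64-bit MSR value. */
static uint16_t ex_synd_from_msr(uint64_t msr)
{
	return (uint16_t)((msr >> EX_SYND_MSR_SHIFT) & EX_SYND_MASK);
}

/* Replace the syndrome field of a status word, leaving other bits alone. */
static uint64_t ex_status_set_synd(uint64_t status, uint16_t synd)
{
	status &= ~(EX_SYND_MASK << EX_STATUS_SYND_SHIFT);
	status |= ((uint64_t)synd & EX_SYND_MASK) << EX_STATUS_SYND_SHIFT;
	return status;
}

/* Read the syndrome field back out of a status word. */
static uint16_t ex_status_get_synd(uint64_t status)
{
	return (uint16_t)((status >> EX_STATUS_SYND_SHIFT) & EX_SYND_MASK);
}
```

With something along those lines, consumers that already decode the
syndrome from mce->status would not have to learn about a new field.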

> Because previously we already had a memory address of the memory
> corruption, right?

We've always had the address and the syndrome. The syndrome is in
MCI_STATUS on older machines.

> What is the typical 'scope' of that memory corruption address - a
> cache line, a machine word, a byte or maybe a variable unit that is
> memory hardware dependent?

Typically 128 bits, as the example above shows. The syndrome covers the
whole 128 bits. AFAIR(!), DRAM accesses are always done in 128-bit words
even if less is being read. All nicely hidden by the DRAM controller.
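
In address terms, and assuming the 128-bit granularity above holds, the
scope of a reported error is the 16-byte aligned block containing the
address. A minimal sketch, with made-up helper names:

```c
#include <stdint.h>

/* Assumption from the text above: one ECC word is 128 bits = 16 bytes. */
#define EX_ECC_WORD_BYTES	16ULL

/* Start of the 16-byte aligned block that contains addr. */
static uint64_t ex_ecc_word_base(uint64_t addr)
{
	return addr & ~(EX_ECC_WORD_BYTES - 1);
}

/* Byte offset of addr within its 16-byte block. */
static unsigned ex_ecc_word_offset(uint64_t addr)
{
	return (unsigned)(addr & (EX_ECC_WORD_BYTES - 1));
}
```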

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
