lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 1 Nov 2012 23:02:49 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	"Luck, Tony" <tony.luck@...el.com>
Cc:	Mauro Carvalho Chehab <mchehab@...hat.com>,
	Linux Edac Mailing List <linux-edac@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report
 conflicts

On Thu, Nov 01, 2012 at 09:09:07PM +0000, Luck, Tony wrote:
> > That is correct, unfortunately. That information is not available to
> > software in all cases. Maybe APEI could be used for that DIMM location
> > mapping through simple tables instead of letting it fumble the error
> > handling path.
> 
> Not much hope for "simple"[1] tables.  There is also a timings issue on
> system with rank sparing, memory mirroring etc.  ... you need to decode
> to the DIMM at the time the error happened. If you wait until later, then
> the system may have switched over to the spare rank or mirror ... and then
> your decode will point at the new target, rather than the old.

Yeah, normally we're decoding the error right after being logged so...

> [1] Consider a 4 cpu-socket machine with 4 channels per socket and three
> DIMMs per channel - so there are 48 sockets on the motherboard. Then

You mean 48 DIMM slots, right?

> some lab monkey takes a box of random 1, 2, 4, 8 GB DIMMs and fills
> most of the sockets. BIOS will somehow make sense out of this and
> interleave where it finds matching speeds across pairs/quads of
> channels (though size need not match ... if you have a 2G and 4G DIMM
> you may get interleaving for the part. then non-interleaved for the
> "extra" 2G).

Right, but at least in the csrow case, we still can compute back the
csrow even with the interleaving, after we know how it is done exactly
(on which address bits, etc). I think this should be doable on Intel
controllers too but I don't know.

-- 
Regards/Gruss,
    Boris.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ