Message-ID: <3908561D78D1C84285E8C5FCA982C28F192F6672@ORSMSX104.amr.corp.intel.com>
Date:	Wed, 30 May 2012 23:24:41 +0000
From:	"Luck, Tony" <tony.luck@...el.com>
To:	Mauro Carvalho Chehab <mchehab@...hat.com>,
	Borislav Petkov <bp@...64.org>
CC:	Linux Edac Mailing List <linux-edac@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Aristeu Rozanski <arozansk@...hat.com>,
	Doug Thompson <norsk5@...oo.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...hat.com>
Subject: RE: [PATCH] RAS: Add a tracepoint for reporting memory controller
 events

>         u32 grain;              /* granularity of reported error in bytes */
> 				   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

>> 			dimm->grain = nr_pages << PAGE_SHIFT;

I'm not at all sure what we'll see digging into the chipset registers
like EDAC does - but we do see different granularities when reporting
via machine check banks.  That's why we have this code:

                /*
                 * Mask the reported address by the reported granularity.
                 */
                if (mce_ser && (m->status & MCI_STATUS_MISCV)) {
                        u8 shift = MCI_MISC_ADDR_LSB(m->misc);
                        m->addr >>= shift;
                        m->addr <<= shift;
                }

in mce_read_aux().  In practice right now I think that many errors will
report with cache line granularity, while a few (IIRC patrol scrub) will
report with page (4K) granularity. Linux doesn't really care - they all
have to get rounded up to page size because we can't take away just one
cache line from a process.
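
To make that concrete, here is a minimal self-contained sketch (ordinary
userspace C, not the kernel code; the sample address and shift values are
invented for illustration). It applies the same mask-by-granularity trick
as the mce_read_aux() snippet above, then derives the page frame number
that Linux would round the error up to:

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12   /* 4K pages, as on x86 */

/* Clear the low 'lsb' bits of the reported address, the same
 * operation mce_read_aux() performs with MCI_MISC_ADDR_LSB(m->misc). */
static uint64_t mask_by_granularity(uint64_t addr, uint8_t lsb)
{
        addr >>= lsb;
        addr <<= lsb;
        return addr;
}

int main(void)
{
        uint64_t addr = 0x12345678abcULL; /* invented error address */
        uint8_t lsb   = 6;                /* 64-byte cache-line granularity */

        uint64_t masked = mask_by_granularity(addr, lsb);

        /* The granularity in bytes - the analogue of EDAC's
         * dimm->grain - is just 1 << lsb. */
        uint64_t grain = 1ULL << lsb;

        /* Whatever the reported granularity, error handling rounds
         * up to a whole page: this is the pfn Linux would offline. */
        uint64_t pfn = masked >> PAGE_SHIFT;

        printf("reported addr: 0x%llx\n", (unsigned long long)addr);
        printf("grain (bytes): %llu\n",   (unsigned long long)grain);
        printf("masked addr:   0x%llx\n", (unsigned long long)masked);
        printf("pfn:           0x%llx\n", (unsigned long long)pfn);
        return 0;
}

With lsb = 6 the low six bits are cleared (cache-line granularity); a
patrol scrub error reporting with lsb = 12 is already page aligned, so
the round-up to page size is a no-op for it.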

> @Tony: Can you assure us that, on Intel memory controllers, the address
> mask remains constant over the module's lifetime, or are there any
> events that may change it (memory hot-plug, mirror mode changes,
> interleaving reconfiguration, ...)?

I could see different controllers (or even different channels) having
different setup if you have a system with different size/speed/#ranks
DIMMs ... most systems today allow almost arbitrary mix & match, and the
BIOS will decide which interleave modes are possible based on what it
finds in the slots.  Mirroring imposes more constraints, so you will
see fewer crazy options. Hot plug for Linux reduces to just the hot add
case (as we still don't have a good way to remove DIMM sized chunks of
memory) ... so I don't see any clever reconfiguration possibilities
there (when you add memory, all the existing memory had better stay
where it is, preserving contents). Perhaps the only option where things
might change radically is socket migration ... where the constraint is
only that the target of the migration have >= memory of the source. So
you might move from some weird configuration with mixed DIMM sizes and
thus no interleave, to a homogeneous socket with matched DIMMs and full
interleave. But from an EDAC level, this is a new controller on a new
socket ... not a changed configuration on an existing socket.

-Tony
