Message-ID: <20150615002500.GC4214@hori1.linux.bs1.fc.nec.co.jp>
Date: Mon, 15 Jun 2015 00:25:00 +0000
From: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
To: "Luck, Tony" <tony.luck@...el.com>
CC: Xishi Qiu <qiuxishi@...wei.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"nao.horiguchi@...il.com" <nao.horiguchi@...il.com>,
Yinghai Lu <yinghai@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
"mingo@...e.hu" <mingo@...e.hu>, Xiexiuqi <xiexiuqi@...wei.com>,
Hanjun Guo <guohanjun@...wei.com>,
Linux MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 00/12] mm: mirrored memory support for page buddy allocations

On Fri, Jun 12, 2015 at 12:03:35PM -0700, Luck, Tony wrote:
> On Fri, Jun 12, 2015 at 08:42:33AM +0000, Naoya Horiguchi wrote:
> > 4?) I don't have the whole picture of how address range mirroring works,
> > but I'm curious about what happens when an uncorrected memory error occurs
> > on a mirrored page. If HW/FW does some useful work that is invisible to the
> > kernel, please document it somewhere. And my questions are:
> > - can the kernel with this patchset really continue its operation without
> > breaking consistency? More specifically, the corrupted page is replaced with
> > its mirror page, but can any other pages which hold references (like struct
> > page or pfn) to the corrupted page properly switch those references to the
> > mirror page? Or is there no need to worry about that? (This is difficult for
> > kernel pages like slab, and that's why hwpoison currently doesn't handle any
> > kernel pages.)
>
> The mirror is operated by h/w (perhaps with some platform firmware
> intervention when things start breaking badly).
>
> In normal operation there are two DIMM addresses backing each
> system physical address in the mirrored range (thus total system
> memory capacity is reduced when mirror is enabled). Memory writes
> are directed to both locations. Memory reads are interleaved to
> maintain bandwidth, so could come from either address.
I had misunderstood that both the mirrored page and its mirror copy are visible
to the OS, which is not the case.
> When a read returns with an ECC failure the h/w automatically:
> 1) Re-issues the read to the other DIMM address. If that also fails - then
> we do the normal machine check processing for an uncorrected error
> 2) But if the other side of the mirror is good, we can send the good
> data to the reader (cpu, or dma) and, in parallel try to fix the
> bad side by writing the good data to it.
> 3) A corrected error will be logged; it may indicate whether the
> attempt to fix succeeded or not.
> 4) If platform firmware wants, it can be notified of the correction
> and it may keep statistics on the rate of errors, correction status,
> etc. If things get very bad it may "break" the mirror and direct
> all future reads to the remaining "good" side. If it does this it will
> likely tell the OS via some ACPI method.
Thanks, this fully answered my question.
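
Just to restate my understanding of that read path, a purely illustrative
user-space simulation (the mirror is modeled as two in-memory copies of one
line and an ECC failure as a per-copy flag; none of these names correspond
to real kernel or firmware code):

/*
 * Purely illustrative: simulates the read path described above.  The mirror
 * is two in-memory copies of one "cache line", and an ECC failure is a
 * per-copy flag.  Nothing here is real kernel or firmware code.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define LINE_SIZE 64

struct mirrored_line {
        unsigned char copy[2][LINE_SIZE];  /* two DIMM copies of one line */
        bool bad[2];                       /* simulated ECC failure per copy */
};

/* Return true if good data was delivered to the reader (cpu or dma). */
static bool mirrored_read(struct mirrored_line *m, int side, unsigned char *buf)
{
        if (!m->bad[side]) {               /* normal case, no error */
                memcpy(buf, m->copy[side], LINE_SIZE);
                return true;
        }

        /* 1) re-issue the read to the other side of the mirror */
        if (m->bad[!side]) {
                /* both sides bad: normal uncorrected machine check */
                printf("uncorrected error -> machine check\n");
                return false;
        }

        /* 2) good data goes to the reader; fix the bad side with it */
        memcpy(buf, m->copy[!side], LINE_SIZE);
        memcpy(m->copy[side], m->copy[!side], LINE_SIZE);
        m->bad[side] = false;

        /* 3)/4) a corrected error is logged; firmware may keep statistics */
        printf("corrected error, bad side rewritten\n");
        return true;
}

int main(void)
{
        struct mirrored_line line = { .bad = { true, false } };
        unsigned char buf[LINE_SIZE];

        memset(line.copy[1], 0xa5, LINE_SIZE);  /* the good side */
        if (mirrored_read(&line, 0, buf))       /* read hits the bad side */
                printf("reader got good data: 0x%02x\n", buf[0]);
        return 0;
}
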
> All of this is done at much less than page granularity. Cache coherence
> is maintained ... apart from some small performance glitches and the corrected
> error logs, the OS is unaware of all of this.
>
> Note that in current implementations the mirror copies are both behind
> the same memory controller ... so this isn't intended to cope with high
> level failure of a memory controller ... just to deal with randomly
> distributed ECC errors.
OK, I looked at "Memory Address Range Mirroring Validation Guide" and Fig 2-2
clearly shows that.
> > - How can we test/confirm that the whole scheme works fine? Is the current
> > memory error injection framework enough?
>
> Still working on that piece. To validate you need to be able to
> inject errors to just one side of the mirror, and I'm not really
> sure that the ACPI/EINJ interface is up to the task.
OK.
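
For reference, the generic way to inject a memory error from the OS side today
is the ACPI EINJ debugfs interface; a minimal sketch is below (assuming
CONFIG_ACPI_APEI_EINJ, a mounted debugfs, and a BIOS that supports
address-specific injection; and, as you say, it has no notion of which side of
the mirror the error lands on):

/*
 * Illustrative only: request injection of an uncorrectable memory error
 * at a given physical address through the ACPI EINJ debugfs files.  This
 * exercises the generic injection path; it cannot choose which side of a
 * h/w mirror the error lands on.
 */
#include <stdio.h>
#include <stdlib.h>

#define EINJ_DIR "/sys/kernel/debug/apei/einj/"

static void einj_write(const char *file, const char *val)
{
        char path[256];
        FILE *f;

        snprintf(path, sizeof(path), EINJ_DIR "%s", file);
        f = fopen(path, "w");
        if (!f) {
                perror(path);
                exit(1);
        }
        fprintf(f, "%s\n", val);
        fclose(f);
}

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <physical address>\n", argv[0]);
                return 1;
        }

        einj_write("error_type", "0x10");       /* memory uncorrectable non-fatal */
        einj_write("param1", argv[1]);          /* target physical address */
        einj_write("param2", "0xfffffffffffff000"); /* address mask (one 4KB page) */
        einj_write("error_inject", "1");        /* fire the injection */
        return 0;
}
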
Thanks,
Naoya Horiguchi