Message-ID: <5028D2B0.4010800@ce.jp.nec.com>
Date: Mon, 13 Aug 2012 19:10:56 +0900
From: "Jun'ichi Nomura" <j-nomura@...jp.nec.com>
To: Andi Kleen <andi@...stfloor.org>
CC: Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Andi Kleen <andi.kleen@...el.com>,
Wu Fengguang <fengguang.wu@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Tony Luck <tony.luck@...el.com>,
Rik van Riel <riel@...hat.com>,
Naoya Horiguchi <nhoriguc@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache
On 08/11/12 08:09, Andi Kleen wrote:
> Naoya Horiguchi <n-horiguchi@...jp.nec.com> writes:
>
>> Current memory error handling on dirty pagecache has a bug that user
>> processes who use corrupted pages via read() or write() can't be aware
>> of the memory error and result in discarding dirty data silently.
>>
>> The following patch is to improve handling/reporting memory errors on
>> this case, but as a short term solution I suggest that we should undo
>> the present error handling code and just leave errors for such cases
>> (which expect the 2nd MCE to panic the system) to ensure data consistency.
>
> Not sure that's the right approach. It's not worse than any other IO
> errors isn't it?
IMO, it's worse in certain cases. For example, consider a producer-consumer
type program that uses a file as temporary storage.
The current memory-failure.c drops the produced data from the dirty pagecache
and lets the reader consume old or empty data from disk (silently!).
That's what I think HWPOISON should prevent.
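
To make the scenario concrete, here is a minimal userspace sketch of such a
producer-consumer pair. The function names produce_record() and
consume_record() are hypothetical and chosen only for illustration; they are
not taken from any real program or from the kernel. The sketch assumes the
producer does not call fsync(), so the record may live only in dirty
pagecache when the consumer reads it.

```c
/* Sketch only: produce_record() and consume_record() are hypothetical
 * helper names used for illustration, not kernel or libc APIs. */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Producer: write one record to the temporary file. There is no fsync()
 * here, so after write() returns the data may exist only in dirty
 * pagecache and not yet on disk. Returns 0 on success, -1 on failure. */
static int produce_record(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    if (close(fd) < 0)
        return -1;

    return (n == (ssize_t)len) ? 0 : -1;
}

/* Consumer: read the record back with plain read(). The consumer can only
 * react to an error return; in the case described above no error is
 * returned, so old or empty data from disk would look like valid data.
 * Returns the number of bytes read, or -1 on failure. */
static ssize_t consume_record(const char *path, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

On a healthy system the consumer gets back exactly what the producer wrote.
The point is that nothing in consume_record() can tell the difference if the
dirty page was dropped in between, because read() still reports success.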

A similar thing could theoretically happen with disk I/O errors. In practice,
though, those errors are often persistent, and the reader will likely get
errors again instead of bad data.
Also, ext3/ext4 has an option to panic when an error is detected (the
errors=panic mount option), for people who want to avoid corruption on
intermittent errors.
--
Jun'ichi Nomura, NEC Corporation