linux-kernel - Re: [RFC Patch 0/2] mm: Add parameters to make kernel behavior at memory error on dirty cache selectable

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <516E3383.5060105@gmail.com>
Date:	Wed, 17 Apr 2013 13:30:43 +0800
From:	Simon Jeons <simon.jeons@...il.com>
To:	Andi Kleen <andi@...stfloor.org>
CC:	Mitsuhiro Tanino <mitsuhiro.tanino.gm@...achi.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>
Subject: Re: [RFC Patch 0/2] mm: Add parameters to make kernel behavior at
 memory error on dirty cache selectable

On 04/11/2013 09:49 PM, Andi Kleen wrote:
>> As a result, if the dirty cache includes user data, the data is lost,
>> and data corruption occurs if an application uses old data.

Hi Andi,

Could you give me the link of your mce testcase?

> The application cannot use old data, the kernel code kills it if it
> would do that. And if it's IO data there is an EIO triggered.
>
> iirc the only concern in the past was that the application may miss
> the asynchronous EIO because it's cleared on any fd access.
>
> This is a general problem not specific to memory error handling,
> as these asynchronous IO errors can happen due to other reason
> (bad disk etc.)
>
> If you're really concerned about this case I think the solution
> is to make the EIO more sticky so that there is a higher chance
> than it gets returned.  This will make your data much more safe,
> as it will cover all kinds of IO errors, not just the obscure memory
> errors.
>
> Or maybe have a panic knob on any IO error for any case if you don't
> trust your application to check IO syscalls. But I would rather
> have better EIO reporting than just giving up like this.
>
> The problem of tying it just to any dirty data for memory errors
> is that most anonymous data is dirty and it doesn't have this problem
> at all (because the signals handle this and they cannot be lost)
>
> And that is a far more common case than this relatively unlikely
> case of dirty IO data.
>
> So just doing it for "dirty" is not the right knob.
>
> Basically I'm saying if you worry about unreliable IO error reporting
> fix IO error reporting, don't add random unnecessary panics to
> the memory error handling.
>
> BTW my suspicion is that if you approach this from a data driven
> perspective: that is measure how much such dirty data is typically
> around in comparison to other data it will be unlikely. Such
> a study can be done with the "page-types" program in tools/vm
>
> -Andi
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@...ck.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@...ck.org"> email@...ck.org </a>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/