lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1401245361.5049.6.camel@cyc>
Date:	Wed, 28 May 2014 10:49:21 +0800
From:	Chen Yucong <slaoub@...il.com>
To:	Borislav Petkov <bp@...en8.de>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	linux-edac <linux-edac@...r.kernel.org>, X86 ML <x86@...nel.org>,
	Tony Luck <tony.luck@...el.com>
Subject: Re: [RFC PATCH 0/3] RAS: Correctable Errors Collector thing


> From: Borislav Petkov <bp@...e.de>
> 
> Hi all,
> 
> this is something Tony and I have been working on behind the curtains
> recently. Here it is in a RFC form, it passes quick testing in kvm. Let
> me send it out before I start hammering on it on a real machine.
> 
> More indepth info about what it is and what it does is in patch 1/3.
> 
> As always, comments and suggestions are most welcome.
> 
> Thanks.

What's the point of this patch set?
My understanding is that if there are some(COUNT_MASK) corrected DRAM
ECC errors for a specific page frame, we can believe that the page frame
is so ill that it should be isolated as soon as possible.

The question is: memory_failure can not be used for isolating the page
frame which is being used by kernel, because it just poison the page and
IGNORED. memory_failure is mostly used for handling AR/AO type errors
related to the page frame which the userspace tasks are using now.

Although the relative page frame is very ill, it is not dead and can
still work. However, memory_failure may kill the userspace tasks,
especially for those page frames that are holding dynamic data rather
than file-backed(file/swap) data.

So I do not think that it is a good idea to directly use memory_failure
in this patch set. 

thx!
cyc


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ