Message-ID: <584eedd3-9369-9df1-39e2-62e331abdcc0@bytedance.com>
Date:   Sun, 5 Jun 2022 12:24:24 +0800
From:   zhenwei pi <pizhenwei@...edance.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     naoya.horiguchi@....com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Tony Luck <tony.luck@...el.com>,
        Wu Fengguang <fengguang.wu@...el.com>
Subject: Re: Re: [PATCH] mm/memory-failure: don't allow to unpoison hw
 corrupted page



On 6/5/22 02:56, Andrew Morton wrote:
> On Sat,  4 Jun 2022 18:32:29 +0800 zhenwei pi <pizhenwei@...edance.com> wrote:
> 
>> Currently unpoison_memory(unsigned long pfn) is designed for soft
>> poison(hwpoison-inject) only. Unpoisoning a hardware corrupted page
>> puts page back buddy only, this leads BUG during accessing on the
>> corrupted KPTE.
>>
>> Do not allow to unpoison hardware corrupted page in unpoison_memory()
>> to avoid BUG like this:
>>
>>   Unpoison: Software-unpoisoned page 0x61234
>>   BUG: unable to handle page fault for address: ffff888061234000
> 
> Thanks.
> 
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -2090,6 +2090,7 @@ int unpoison_memory(unsigned long pfn)
>>   {
>>   	struct page *page;
>>   	struct page *p;
>> +	pte_t *kpte;
>>   	int ret = -EBUSY;
>>   	int freeit = 0;
>>   	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
>> @@ -2101,6 +2102,13 @@ int unpoison_memory(unsigned long pfn)
>>   	p = pfn_to_page(pfn);
>>   	page = compound_head(p);
>>   
>> +	kpte = virt_to_kpte((unsigned long)page_to_virt(p));
>> +	if (kpte && !pte_present(*kpte)) {
>> +		unpoison_pr_info("Unpoison: Page was hardware poisoned %#lx\n",
>> +				 pfn, &unpoison_rs);
>> +		return -EPERM;
>> +	}
>> +
>>   	mutex_lock(&mf_mutex);
>>   
>>   	if (!PageHWPoison(p)) {
> 
> I guess we don't want to let fault injection crash the kernel, so a
> cc:stable seems appropriate here.
> 
> Can we think up a suitable Fixes: commit?  I'm suspecting this bug has
> been there for a long time?
> 

Sure!

On 2009-Dec-16, hwpoison_unpoison() was introduced into Linux in commit
847ce401df392 ("HWPOISON: Add unpoisoning support"):
...
There is no hardware level unpoisioning, so this cannot be used for real 
memory errors, only for software injected errors.
...

Both the commit log and the comment in the source code say that this
function is meant for software-level unpoisoning only. Unfortunately,
hwpoison_unpoison() has no check that enforces this.


On 2020-May-20, commit 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire
page if the whole page is affected and poisoned"):

This commit clears the KPTE, which leads to the BUG described in this
patch when a hardware corrupted page is unpoisoned.
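
To illustrate the distinction, here is a minimal userspace sketch of the
check the patch adds. It is only a model under stated assumptions:
struct model_page, model_soft_poison(), model_hw_poison(), model_unpoison()
and model_demo() are hypothetical names, not kernel APIs. The real code
uses virt_to_kpte() and pte_present() on the kernel mapping, and the
premise that soft injection leaves that mapping intact is taken from the
patch description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a page; not a kernel structure. */
struct model_page {
	bool hwpoison;      /* models PageHWPoison(p) */
	bool kpte_present;  /* models pte_present(*kpte) for the kernel mapping */
};

/* Models soft injection: page marked poisoned, kernel mapping untouched. */
static void model_soft_poison(struct model_page *pg)
{
	pg->hwpoison = true;
}

/* Models a real memory error: page marked poisoned and its KPTE cleared. */
static void model_hw_poison(struct model_page *pg)
{
	pg->hwpoison = true;
	pg->kpte_present = false;
}

/* Models the proposed check: refuse to unpoison when the KPTE is gone. */
static int model_unpoison(struct model_page *pg)
{
	if (!pg->kpte_present)
		return -EPERM;	/* hardware poisoned, must stay poisoned */
	if (!pg->hwpoison)
		return 0;	/* nothing to do (simplification of this sketch) */
	pg->hwpoison = false;
	return 0;
}

/* Walks through both cases; returns 0 if the model behaves as described. */
static int model_demo(void)
{
	struct model_page soft = { .hwpoison = false, .kpte_present = true };
	struct model_page hard = { .hwpoison = false, .kpte_present = true };

	model_soft_poison(&soft);
	model_hw_poison(&hard);

	assert(model_unpoison(&soft) == 0 && !soft.hwpoison);
	assert(model_unpoison(&hard) == -EPERM && hard.hwpoison);
	return 0;
}
```

In this model, without the kpte_present test the second page would be
handed back as usable while its mapping is still cleared, which
corresponds to the page fault shown in the patch description.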


Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support")
Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")

Cc: Wu Fengguang <fengguang.wu@...el.com>
Cc: Tony Luck <tony.luck@...el.com>

-- 
zhenwei pi
