[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210225123806.GA15006@hori.linux.bs1.fc.nec.co.jp>
Date: Thu, 25 Feb 2021 12:38:06 +0000
From: HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@....com>
To: Oscar Salvador <osalvador@...e.de>
CC: Aili Yao <yaoaili@...gsoft.com>,
"tony.luck@...el.com" <tony.luck@...el.com>,
"david@...hat.com" <david@...hat.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"bp@...en8.de" <bp@...en8.de>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"hpa@...or.com" <hpa@...or.com>, "x86@...nel.org" <x86@...nel.org>,
"inux-edac@...r.kernel.org" <inux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"yangfeng1@...gsoft.com" <yangfeng1@...gsoft.com>
Subject: Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned
On Thu, Feb 25, 2021 at 12:39:30PM +0100, Oscar Salvador wrote:
> On Thu, Feb 25, 2021 at 11:28:18AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> > Hi Aili,
> >
> > I agree that this set_mce_nospec() is not expected to be called for
> > "already hwpoisoned" page because in the reported case the error
> > page is already contained and no need to resort changing cache mode.
>
> Out of curiosity, what is the current behavour now?
> Say we have an ongoing MCE which has marked the page as HWPoison but
> memory_failure did not take any action on the page yet.
> And then, we have another MCE, which ends up there.
> set_mce_nospec might clear _PAGE_PRESENT bit.
>
> Does that have any impact on the first MCE?
Hi Oscar,
Thank you for shedding light on this, this race looks worrisome to me.
We call try_to_unmap() inside memory_failure(), where we find affected
ptes by page_vma_mapped_walk() and convert into hwpoison entires in
try_to_unmap_one(). So there seems two racy cases:
1)
CPU 0 CPU 1
page_vma_mapped_walk
clear _PAGE_PRESENT bit
// skipped the entry
2)
CPU 0 CPU 1
page_vma_mapped_walk
try_to_unmap_one
clear _PAGE_PRESENT bit
convert the entry
set_pte_at
In case 1, the affected processes get signals on later access,
so although the info in SIGBUS could be different, that's OK.
And we have no impact in case 2.
Thanks,
Naoya Horiguchi
Powered by blists - more mailing lists