Message-ID: <e928b6a2-2bb4-82dc-1508-5b293ecb7539@huawei.com>
Date: Fri, 8 Apr 2022 11:31:05 +0800
From: Miaohe Lin <linmiaohe@...wei.com>
To: HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@....com>
CC: Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Yang Shi <shy828301@...il.com>,
Dan Carpenter <dan.carpenter@...cle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH v7] mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()

On 2022/4/8 9:56, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Thu, Apr 07, 2022 at 09:38:26PM +0800, Miaohe Lin wrote:
>> On 2022/4/7 19:29, Naoya Horiguchi wrote:
> ...
>>> +int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
>>> +{
>>> +	struct page *page = pfn_to_page(pfn);
>>> +	struct page *head = compound_head(page);
>>> +	int ret = 2;	/* fallback to normal page handling */
>>> +	bool count_increased = false;
>>> +
>>> +	if (!PageHeadHuge(head))
>>> +		goto out;
>>> +
>>> +	if (flags & MF_COUNT_INCREASED) {
>>> +		ret = 1;
>>> +		count_increased = true;
>>> +	} else if (HPageFreed(head) || HPageMigratable(head)) {
>>> +		ret = get_page_unless_zero(head);
>>> +		if (ret)
>>> +			count_increased = true;
>>> +	} else {
>>> +		ret = -EBUSY;
>>> +		goto out;
>>> +	}
>>> +
>>> +	if (hwpoison_filter(page)) {
>>> +		ret = -EOPNOTSUPP;
>>> +		goto out;
>>> +	}
>>
>> Now hwpoison_filter is done without lock_page + unlock_page. Is this ok or
>> lock_page + unlock_page pair is indeed required?
>
> Hmm, we had better call hwpoison_filter in page lock for hugepages.
> I'll move this too, thank you.
>
>>> +
>>> +	if (TestSetPageHWPoison(head)) {
>>> +		ret = -EHWPOISON;
>>> +		goto out;
>>> +	}
>>
>> Without this patch, page refcnt is not decremented if MF_COUNT_INCREASED is set in flags
>> when PageHWPoison is already set. So I think this patch also fixes that issue. Thanks!
>
> Good point, I even didn't notice that. And the issue still seems to exist
> for normal page's cases. Maybe encountering "already hwpoisoned" case from
> madvise_inject_error() is rare but could happen when the first call failed
> to contain the error (which is still accessible from the calling process).

Oh, I missed that the same issue exists for normal pages. :) Would you kindly fix it,
or should I send a fix?

Many thanks.
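
For reference, the refcount imbalance discussed above can be modelled in user space. This is only a rough sketch, not kernel code: struct toy_page, toy_get_page(), toy_put_page(), toy_memory_failure() and TOY_MF_COUNT_INCREASED are hypothetical stand-ins, and the model assumes the handler simply drops its reference once it is done with a newly poisoned page. It illustrates that when the caller has already taken a reference and the page turns out to be already poisoned, the early-return path has to drop that reference as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MF_COUNT_INCREASED. */
#define TOY_MF_COUNT_INCREASED 0x1

/* Toy model of a page: only a reference count and a poison bit. */
struct toy_page {
	int refcount;
	bool hwpoison;
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Returns 0 if the page was newly marked poisoned, -1 if it was
 * already poisoned. drop_caller_ref selects whether the "already
 * poisoned" path releases the reference taken by the caller.
 */
static int toy_memory_failure(struct toy_page *p, int flags,
			      bool drop_caller_ref)
{
	if (p->hwpoison) {
		/* Early return: the caller's reference, if any, must go. */
		if ((flags & TOY_MF_COUNT_INCREASED) && drop_caller_ref)
			toy_put_page(p);
		return -1;
	}

	p->hwpoison = true;

	/* Take our own reference unless the caller already did. */
	if (!(flags & TOY_MF_COUNT_INCREASED))
		toy_get_page(p);

	/* Assumption of this model: the handler drops the ref when done. */
	toy_put_page(p);
	return 0;
}
```

Calling toy_memory_failure() with TOY_MF_COUNT_INCREASED on an already poisoned page leaves one extra reference behind when drop_caller_ref is false, and leaves the count balanced when it is true.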
>
...