linux-kernel - Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation

Message-ID: <e3e3ae0f-50f6-6b13-c520-26aac353e0cb@huawei.com>
Date:   Mon, 28 Mar 2022 10:14:05 +0800
From:   Miaohe Lin <linmiaohe@...wei.com>
To:     Rik van Riel <riel@...riel.com>
CC:     <linux-mm@...ck.org>, <kernel-team@...com>,
        Oscar Salvador <osalvador@...e.de>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Mel Gorman <mgorman@...e.de>,
        Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        <stable@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation

On 2022/3/27 4:14, Rik van Riel wrote:
> On Sat, 2022-03-26 at 15:48 +0800, Miaohe Lin wrote:
>> On 2022/3/26 4:14, Rik van Riel wrote:
>>>
>>> +++ b/mm/memory.c
>>> @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>>>                 return ret;
>>>  
>>>         if (unlikely(PageHWPoison(vmf->page))) {
>>> +               struct page *page = vmf->page;
>>>                 vm_fault_t poisonret = VM_FAULT_HWPOISON;
>>>                 if (ret & VM_FAULT_LOCKED) {
>>> +                       if (page_mapped(page))
>>> +                               unmap_mapping_pages(page_mapping(page),
>>> +                                                   page->index, 1, false);
>>
>> It seems this unmap_mapping_pages also helps the success rate of the
>> below invalidate_inode_page.
>>
> 
> That is indeed what it is supposed to do.
> 
> It isn't foolproof, since you can still end up
> with dirty pages that don't get cleaned immediately,
> but it seems to turn infinite loops of a program
> being killed every time it's started into a more
> manageable situation where the task succeeds again
> pretty quickly.

Looks convincing to me.

> 
>>>                         /* Retry if a clean page was removed from the cache. */
>>> -                       if (invalidate_inode_page(vmf->page))
>>> -                               poisonret = 0;
>>> -                       unlock_page(vmf->page);
>>> +                       if (invalidate_inode_page(page))
>>> +                               poisonret = VM_FAULT_NOPAGE;
>>> +                       unlock_page(page);
>>>                 }
>>> -               put_page(vmf->page);
>>> +               put_page(page);
>>
>> Do we use page instead of vmf->page just for simplicity? Or is there
>> some other concern?
>>
> 
> Just a simplification, and not dereferencing the same thing
> 6 times.
> 

I see. :)

>>>                 vmf->page = NULL;
>>
>> We return either VM_FAULT_NOPAGE or VM_FAULT_HWPOISON with vmf->page
>> = NULL. In any case, finish_fault won't be called later. So I think
>> your fix is right.
> 
> Want to send in a Reviewed-by or Acked-by? :)
> 

Sure, but when I think more about this, it seems this fix isn't ideal:
if VM_FAULT_NOPAGE is returned with the page table entry unset, the process
will re-trigger the page fault again and again until invalidate_inode_page
succeeds in evicting the inode page. This might hang the process for a really
long time. Or am I missing something?
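
To make the return-value question concrete, here is a rough user-space sketch
of the poisoned-page branch as it reads in the quoted hunk. It is only a toy
model, not kernel code: every toy_* name is hypothetical, the enum values are
arbitrary, and it assumes that a VM_FAULT_NOPAGE return simply causes the
access to fault again and that a dirty page can never be invalidated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel return codes; values are arbitrary. */
enum toy_fault { TOY_FAULT_MAPPED, TOY_FAULT_NOPAGE, TOY_FAULT_HWPOISON };

/* Toy page-cache slot: at most one cached page, which may be poisoned. */
struct toy_cache {
	bool cached;    /* a page is present in the cache */
	bool poisoned;  /* that page is marked hwpoison */
	bool dirty;     /* dirty pages cannot be invalidated in this model */
};

/* Models invalidate_inode_page(): drops the page only if it is clean. */
static bool toy_invalidate(struct toy_cache *c)
{
	if (c->dirty)
		return false;
	c->cached = false;
	c->poisoned = false;
	return true;
}

/* Models one pass through the fault path, following the quoted hunk. */
static enum toy_fault toy_do_fault(struct toy_cache *c)
{
	if (!c->cached) {
		/* Cache miss: read a fresh, healthy page (assumed to succeed). */
		c->cached = true;
		c->poisoned = false;
		c->dirty = false;
	}
	if (c->poisoned) {
		enum toy_fault poisonret = TOY_FAULT_HWPOISON;

		if (toy_invalidate(c))
			poisonret = TOY_FAULT_NOPAGE; /* caller retries the access */
		return poisonret;
	}
	return TOY_FAULT_MAPPED;
}

/*
 * Retries the access while the fault path asks for it. Returns the number
 * of faults taken on success, -1 if the task would see the HWPOISON result,
 * or -2 if it is still retrying after max_faults attempts.
 */
static int toy_access(struct toy_cache *c, int max_faults)
{
	assert(max_faults > 0);
	for (int faults = 1; faults <= max_faults; faults++) {
		enum toy_fault ret = toy_do_fault(c);

		if (ret == TOY_FAULT_MAPPED)
			return faults;
		if (ret == TOY_FAULT_HWPOISON)
			return -1;
	}
	return -2;
}
```

In this model a clean poisoned page costs one extra fault before the access
succeeds, and a dirty one yields the HWPOISON result on the first fault. How
closely the real fault path matches that is exactly the open question here.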

Thanks.