Message-ID: <3ab39c38-eef5-502c-d290-d745aff7b0bd@huawei.com>
Date:   Thu, 16 Jun 2022 14:08:23 +0800
From:   Miaohe Lin <linmiaohe@...wei.com>
To:     Yang Shi <shy828301@...il.com>, Zach O'Keefe <zokeefe@...gle.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        David Howells <dhowells@...hat.com>, NeilBrown <neilb@...e.de>,
        Alistair Popple <apopple@...dia.com>,
        David Hildenbrand <david@...hat.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Peter Xu <peterx@...hat.com>, Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/7] mm/khugepaged: stop swapping in page when
 VM_FAULT_RETRY occurs

On 2022/6/16 1:51, Yang Shi wrote:
> On Wed, Jun 15, 2022 at 8:14 AM Zach O'Keefe <zokeefe@...gle.com> wrote:
>>
>> On 11 Jun 16:47, Miaohe Lin wrote:
>>> When do_swap_page returns VM_FAULT_RETRY, we do not retry here and thus
>>> swap entry will remain in pagetable. This will result in later failure.
>>> So stop swapping in pages in this case to save cpu cycles.
>>>
>>> Signed-off-by: Miaohe Lin <linmiaohe@...wei.com>
>>> ---
>>>  mm/khugepaged.c | 19 ++++++++-----------
>>>  1 file changed, 8 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 73570dfffcec..a8adb2d1e9c6 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -1003,19 +1003,16 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>>>               swapped_in++;
>>>               ret = do_swap_page(&vmf);
>>>
>>> -             /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
>>> +             /*
>>> +              * do_swap_page returns VM_FAULT_RETRY with released mmap_lock.
>>> +              * Note we treat VM_FAULT_RETRY as VM_FAULT_ERROR here because
>>> +              * we do not retry here and swap entry will remain in pagetable
>>> +              * resulting in later failure.
>>> +              */
>>>               if (ret & VM_FAULT_RETRY) {
>>>                       mmap_read_lock(mm);
>>> -                     if (hugepage_vma_revalidate(mm, haddr, &vma)) {
>>> -                             /* vma is no longer available, don't continue to swapin */
>>> -                             trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> -                             return false;
>>> -                     }
>>> -                     /* check if the pmd is still valid */
>>> -                     if (mm_find_pmd(mm, haddr) != pmd) {
>>> -                             trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> -                             return false;
>>> -                     }
>>> +                     trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> +                     return false;
>>>               }
>>>               if (ret & VM_FAULT_ERROR) {
>>>                       trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
>>> --
>>> 2.23.0
>>>
>>>
>>
>> I've convinced myself this is correct, but don't understand how we got here.
>> AFAICT, we've always continued to fault in pages, and, as you mention, don't
>> retry ones that have failed with VM_FAULT_RETRY - so
>> __collapse_huge_page_isolate() should fail. I don't think (?) there is any
>> benefit to continuing to swap if we don't handle VM_FAULT_RETRY appropriately.
>>
>> So, I think this change looks good from that perspective. I suppose the only
>> other question would be: should we handle the VM_FAULT_RETRY case? Maybe 1
>> additional attempt then fail? AFAIK, this mostly (?) happens when the page is
>> locked.  Maybe it's not worth the extra complexity though..
> 
> It should be unnecessary for khugepaged IMHO since it will scan all
> the valid mm periodically, so it will come back eventually.

I tend to agree with Yang. khugepaged rescans all valid mms periodically, so it
will come back to this range eventually; it's not worth the extra complexity.

Thanks both!
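
For illustration only, here is a minimal user-space sketch of the behaviour
discussed above: the swap-in loop gives up at the first retry result, and a
later periodic pass picks the range up again. It is not kernel code. Every
name in it (sim_pte, sim_swap_in, sim_collapse_swapin, the fault_result
values) is a hypothetical stand-in, and the "locked page returns retry" rule
is an assumption taken from Zach's remark that VM_FAULT_RETRY mostly happens
when the page is locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's fault codes; values are illustrative. */
enum fault_result { FAULT_OK = 0, FAULT_RETRY = 1, FAULT_ERROR = 2 };

/* Hypothetical model of one page table entry in the range being collapsed. */
struct sim_pte {
	bool swapped;	/* entry still refers to swap */
	bool locked;	/* assumption: a locked page makes swap-in ask for a retry */
};

/* Hypothetical model of do_swap_page(): a locked page yields FAULT_RETRY. */
static enum fault_result sim_swap_in(struct sim_pte *pte)
{
	if (pte->locked)
		return FAULT_RETRY;
	pte->swapped = false;
	return FAULT_OK;
}

/*
 * Models the patched loop: stop at the first result that is not FAULT_OK,
 * because an entry left in swap makes the later collapse fail anyway.
 * Returns true if every swap entry in the range was brought in.
 * *attempts is incremented once per swap-in call.
 */
static bool sim_collapse_swapin(struct sim_pte *ptes, size_t n, int *attempts)
{
	assert(ptes != NULL && attempts != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!ptes[i].swapped)
			continue;
		(*attempts)++;
		if (sim_swap_in(&ptes[i]) != FAULT_OK)
			return false;	/* treat retry like an error: bail out */
	}
	return true;
}
```

With eight swapped entries and only the third one locked, the first call makes
three swap-in attempts and returns false. If the page is unlocked before the
next call (standing in for the next periodic scan), the second call makes six
attempts on the remaining entries and returns true, with no retry logic in the
loop itself.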
