Message-ID: <CAHbLzkpc8ag7MkY_D17U1B7SjZFO2Bss8rVVj-scMOC8ttqxEg@mail.gmail.com>
Date:   Wed, 15 Jun 2022 10:49:30 -0700
From:   Yang Shi <shy828301@...il.com>
To:     Miaohe Lin <linmiaohe@...wei.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        David Howells <dhowells@...hat.com>, NeilBrown <neilb@...e.de>,
        Alistair Popple <apopple@...dia.com>,
        David Hildenbrand <david@...hat.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Peter Xu <peterx@...hat.com>, Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/7] mm/khugepaged: stop swapping in page when
 VM_FAULT_RETRY occurs

On Sat, Jun 11, 2022 at 1:47 AM Miaohe Lin <linmiaohe@...wei.com> wrote:
>
> When do_swap_page returns VM_FAULT_RETRY, we do not retry here, and thus
> the swap entry will remain in the pagetable. This will result in a later
> failure. So stop swapping in pages in this case to save CPU cycles.
>
> Signed-off-by: Miaohe Lin <linmiaohe@...wei.com>
> ---
>  mm/khugepaged.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 73570dfffcec..a8adb2d1e9c6 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1003,19 +1003,16 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
>                 swapped_in++;
>                 ret = do_swap_page(&vmf);
>
> -               /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
> +               /*
> +                * do_swap_page returns VM_FAULT_RETRY with released mmap_lock.
> +                * Note we treat VM_FAULT_RETRY as VM_FAULT_ERROR here because
> +                * we do not retry here and the swap entry will remain in the
> +                * pagetable, resulting in later failure.

Yeah, it makes sense.

> +                */
>                 if (ret & VM_FAULT_RETRY) {
>                         mmap_read_lock(mm);

As a further optimization, you should not need to relock mmap_lock. You
could return a different value, or pass in *locked and set it to false,
then check that value in the caller to skip the unlock.
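
Something like the below (untested sketch; the *locked plumbing and the
caller-side handling are just illustrative, not the only way to do it):

static bool __collapse_huge_page_swapin(struct mm_struct *mm,
					struct vm_area_struct *vma,
					unsigned long haddr, pmd_t *pmd,
					int referenced, bool *locked)
{
	...
		ret = do_swap_page(&vmf);

		if (ret & VM_FAULT_RETRY) {
			/*
			 * do_swap_page released mmap_lock; don't retake
			 * it, just tell the caller to skip the unlock.
			 */
			*locked = false;
			trace_mm_collapse_huge_page_swapin(mm, swapped_in,
							   referenced, 0);
			return false;
		}
	...
}

and in collapse_huge_page():

	bool locked = true;

	if (!__collapse_huge_page_swapin(mm, vma, address, pmd,
					 referenced, &locked)) {
		if (locked)
			mmap_read_unlock(mm);
		goto out_nolock;
	}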

> -                       if (hugepage_vma_revalidate(mm, haddr, &vma)) {
> -                               /* vma is no longer available, don't continue to swapin */
> -                               trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> -                               return false;
> -                       }
> -                       /* check if the pmd is still valid */
> -                       if (mm_find_pmd(mm, haddr) != pmd) {
> -                               trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> -                               return false;
> -                       }
> +                       trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> +                       return false;
>                 }
>                 if (ret & VM_FAULT_ERROR) {
>                         trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);

And I think "swapped_in++" needs to be moved after error handling.
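
I.e., only count a page once the fault has actually succeeded, so the
tracepoint reports the number of pages really swapped in (untested
sketch):

		ret = do_swap_page(&vmf);

		if (ret & VM_FAULT_RETRY) {
			...
		}
		if (ret & VM_FAULT_ERROR) {
			trace_mm_collapse_huge_page_swapin(mm, swapped_in,
							   referenced, 0);
			return false;
		}
		/* the swapin succeeded, count it now */
		swapped_in++;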

> --
> 2.23.0
>
>
