lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHbLzkrZb6r1r6xFaEFvvJzwvVgDgeZWfjhq-SFu_mQZ0j5tTQ@mail.gmail.com>
Date:   Mon, 9 May 2022 13:34:09 -0700
From:   Yang Shi <shy828301@...il.com>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Song Liu <songliubraving@...com>,
        Rik van Riel <riel@...riel.com>,
        Matthew Wilcox <willy@...radead.org>, Zi Yan <ziy@...dia.com>,
        "Theodore Ts'o" <tytso@....edu>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>,
        Linux FS-devel Mailing List <linux-fsdevel@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [v3 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent

On Mon, May 9, 2022 at 9:05 AM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 4/4/22 22:02, Yang Shi wrote:
> >  include/linux/huge_mm.h        | 14 ++++++++++++
> >  include/linux/khugepaged.h     | 59 ++++++++++++---------------------------------------
> >  include/linux/sched/coredump.h |  3 ++-
> >  kernel/fork.c                  |  4 +---
> >  mm/huge_memory.c               | 15 ++++---------
> >  mm/khugepaged.c                | 76 +++++++++++++++++++++++++++++++++++++-----------------------------
> >  mm/mmap.c                      | 14 ++++++++----
> >  mm/shmem.c                     | 12 -----------
> >  8 files changed, 88 insertions(+), 109 deletions(-)
>
> Resending my general feedback from mm-commits thread to include the
> public ML's:
>
> There's modestly less lines in the end, some duplicate code removed,
> special casing in shmem.c removed, that's all good as it is. Also patch 8/8
> become quite boring in v3, no need to change individual filesystems and also
> no hook in fault path, just the common mmap path. So I would just handle
> patch 6 differently as I just replied to it, and acked the rest.
>
> That said it's still unfortunately rather a mess of functions that have
> similar names. transhuge_vma_enabled(vma). hugepage_vma_check(vma),
> transparent_hugepage_active(vma), transhuge_vma_suitable(vma, addr)?
> So maybe still some space for further cleanups. But the series is fine as it
> is so we don't have to wait for it now.

Yeah, I agree that we do have a lot thp checks. Will find some time to
look into it deeper later.

>
> We could also consider that the tracking of which mm is to be scanned is
> modelled after ksm which has its own madvise flag, but also no "always"
> mode. What if for THP we only tracked actual THP madvised mm's, and in
> "always" mode just scanned all vm's, would that allow ripping out some code
> perhaps, while not adding too many unnecessary scans? If some processes are

Do you mean add all mm(s) to the scan list unconditionally? I don't
think it will scale.

> being scanned without any effect, maybe track success separately, and scan
> them less frequently etc. That could be ultimately more efficinet than
> painfully tracking just *eligibility* for scanning in "always" mode?

Sounds like we need a couple of different lists, for example, inactive
and active? And promote or demote mm(s) between the two lists? TBH I
don't see too many benefits at the moment. Or I misunderstood you?

>
> Even more radical thing to consider (maybe that's a LSF/MM level topic, too
> bad :) is that we scan pagetables in ksm, khugepaged, numa balancing, soon
> in MGLRU, and I probably forgot something else. Maybe time to think about
> unifying those scanners?

We do have pagewalk (walk_page_range()) which is used by a couple of
mm stuff, for example, mlock, mempolicy, mprotect, etc. I'm not sure
whether it is feasible for khugepaged, ksm, etc, or not since I didn't
look that hard. But I agree it should be worth looking at.

>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ