Message-ID: <CAHbLzkrZb6r1r6xFaEFvvJzwvVgDgeZWfjhq-SFu_mQZ0j5tTQ@mail.gmail.com>
Date: Mon, 9 May 2022 13:34:09 -0700
From: Yang Shi <shy828301@...il.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Song Liu <songliubraving@...com>,
Rik van Riel <riel@...riel.com>,
Matthew Wilcox <willy@...radead.org>, Zi Yan <ziy@...dia.com>,
"Theodore Ts'o" <tytso@....edu>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux MM <linux-mm@...ck.org>,
Linux FS-devel Mailing List <linux-fsdevel@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [v3 PATCH 0/8] Make khugepaged collapse readonly FS THP more consistent
On Mon, May 9, 2022 at 9:05 AM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 4/4/22 22:02, Yang Shi wrote:
> > include/linux/huge_mm.h | 14 ++++++++++++
> > include/linux/khugepaged.h | 59 ++++++++++++---------------------------------------
> > include/linux/sched/coredump.h | 3 ++-
> > kernel/fork.c | 4 +---
> > mm/huge_memory.c | 15 ++++---------
> > mm/khugepaged.c | 76 +++++++++++++++++++++++++++++++++++++-----------------------------
> > mm/mmap.c | 14 ++++++++----
> > mm/shmem.c | 12 -----------
> > 8 files changed, 88 insertions(+), 109 deletions(-)
>
> Resending my general feedback from the mm-commits thread to include the
> public MLs:
>
> There are modestly fewer lines in the end, some duplicate code removed,
> special casing in shmem.c removed, that's all good as it is. Also patch 8/8
> became quite boring in v3, no need to change individual filesystems and also
> no hook in fault path, just the common mmap path. So I would just handle
> patch 6 differently as I just replied to it, and acked the rest.
>
> That said it's still unfortunately rather a mess of functions that have
> similar names: transhuge_vma_enabled(vma), hugepage_vma_check(vma),
> transparent_hugepage_active(vma), transhuge_vma_suitable(vma, addr)?
> So maybe still some space for further cleanups. But the series is fine as it
> is so we don't have to wait for it now.
Yeah, I agree that we do have a lot of THP checks. I will find some time
to look into it more deeply later.
>
> We could also consider that the tracking of which mm is to be scanned is
> modelled after ksm which has its own madvise flag, but also no "always"
> mode. What if for THP we only tracked actual THP madvised mm's, and in
> "always" mode just scanned all vm's, would that allow ripping out some code
> perhaps, while not adding too many unnecessary scans? If some processes are
Do you mean adding all mm(s) to the scan list unconditionally? I don't
think it will scale.
> being scanned without any effect, maybe track success separately, and scan
> them less frequently etc. That could be ultimately more efficient than
> painfully tracking just *eligibility* for scanning in "always" mode?
Sounds like we would need a couple of different lists, for example,
inactive and active, and promote or demote mm(s) between the two? TBH I
don't see many benefits at the moment. Or did I misunderstand you?
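
To make the two-list idea concrete, here is a rough userspace sketch of
how I read it. This is not kernel code: struct mm_slot_model, record_scan(),
should_scan() and both thresholds are made up purely for illustration.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum scan_list { SCAN_ACTIVE, SCAN_INACTIVE };

struct mm_slot_model {
	int id;               /* stands in for an mm */
	unsigned int misses;  /* consecutive scans that collapsed nothing */
	enum scan_list list;  /* list the mm currently sits on */
};

#define DEMOTE_AFTER_MISSES 3  /* assumed threshold */
#define INACTIVE_PERIOD     8  /* assumed: inactive mm(s) scanned every 8th pass */

/* Record the result of one scan pass and move the mm between lists. */
static void record_scan(struct mm_slot_model *slot, int collapsed)
{
	if (collapsed > 0) {
		slot->misses = 0;
		slot->list = SCAN_ACTIVE;    /* promote on any success */
		return;
	}
	if (++slot->misses >= DEMOTE_AFTER_MISSES)
		slot->list = SCAN_INACTIVE;  /* demote after repeated failures */
}

/* Active mm(s) are scanned on every pass, inactive ones only periodically. */
static int should_scan(const struct mm_slot_model *slot, unsigned long pass)
{
	if (slot->list == SCAN_ACTIVE)
		return 1;
	return pass % INACTIVE_PERIOD == 0;
}
```

Even in this toy form, every mm still needs a slot and per-pass
bookkeeping, which is the scaling concern mentioned above.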
>
> Even more radical thing to consider (maybe that's a LSF/MM level topic, too
> bad :) is that we scan pagetables in ksm, khugepaged, numa balancing, soon
> in MGLRU, and I probably forgot something else. Maybe time to think about
> unifying those scanners?
We do have the pagewalk API (walk_page_range()), which is used by several
parts of mm, for example mlock, mempolicy, mprotect, etc. I'm not sure
whether it is feasible for khugepaged, ksm, etc., since I haven't looked
that hard. But I agree it should be worth looking at.
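
As a toy illustration of what sharing one walk between several scanners
could look like, here is a userspace sketch. It only borrows the callback
idea from pagewalk; struct toy_pte, struct toy_walk_ops, toy_walk_range()
and the two example scanners are hypothetical and do not match the real
kernel interfaces.

```c
#include <stddef.h>

/* Toy model: the "page table" is a flat array of entries. */
struct toy_pte { int present; int young; };

/* Hypothetical callback table, loosely inspired by the pagewalk callbacks. */
struct toy_walk_ops {
	/* Return non-zero to stop the walk early. */
	int (*pte_entry)(struct toy_pte *pte, size_t idx, void *priv);
};

struct toy_scanner {
	const struct toy_walk_ops *ops;
	void *priv;  /* per-scanner state */
};

/* A single pass over [start, end) feeds every registered scanner. */
static int toy_walk_range(struct toy_pte *table, size_t start, size_t end,
			  struct toy_scanner *scanners, size_t nr)
{
	for (size_t i = start; i < end; i++) {
		for (size_t s = 0; s < nr; s++) {
			int ret = scanners[s].ops->pte_entry(&table[i], i,
							     scanners[s].priv);
			if (ret)
				return ret;
		}
	}
	return 0;
}

/* Example scanner 1: count present entries. */
static int count_present(struct toy_pte *pte, size_t idx, void *priv)
{
	(void)idx;
	if (pte->present)
		++*(size_t *)priv;
	return 0;
}

/* Example scanner 2: clear the young bit and count how many were cleared. */
static int clear_young(struct toy_pte *pte, size_t idx, void *priv)
{
	(void)idx;
	if (pte->present && pte->young) {
		pte->young = 0;
		++*(size_t *)priv;
	}
	return 0;
}

static const struct toy_walk_ops count_ops = { .pte_entry = count_present };
static const struct toy_walk_ops aging_ops = { .pte_entry = clear_young };
```

The real scanners differ in locking, scan rate and which levels they care
about, so this says nothing about whether such sharing is practical.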
>
>