Message-ID: <CAOUHufa21rX1ODOZ0iJxiy4pstgGqhkbfYozg8-+sRp5ZxAOjA@mail.gmail.com>
Date: Tue, 27 Jun 2023 12:33:09 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Yin Fengwei <fengwei.yin@...el.com>,
David Hildenbrand <david@...hat.com>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-alpha@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-ia64@...r.kernel.org,
linux-m68k@...ts.linux-m68k.org, linux-s390@...r.kernel.org
Subject: Re: [PATCH v1 10/10] mm: Allocate large folios for anonymous memory
On Tue, Jun 27, 2023 at 3:57 AM Ryan Roberts <ryan.roberts@....com> wrote:
>
> On 27/06/2023 04:01, Yu Zhao wrote:
> > On Mon, Jun 26, 2023 at 11:15 AM Ryan Roberts <ryan.roberts@....com> wrote:
> >>
> >> With all of the enabler patches in place, modify the anonymous memory
> >> write allocation path so that it opportunistically attempts to allocate
> >> a large folio up to `max_anon_folio_order()` size (This value is
> >> ultimately configured by the architecture). This reduces the number of
> >> page faults, reduces the size of (e.g. LRU) lists, and generally
> >> improves performance by batching what were per-page operations into
> >> per-(large)-folio operations.
> >>
> >> If CONFIG_LARGE_ANON_FOLIO is not enabled (the default) then
> >> `max_anon_folio_order()` always returns 0, meaning we get the existing
> >> allocation behaviour.
> >>
> >> Signed-off-by: Ryan Roberts <ryan.roberts@....com>
> >> ---
> >> mm/memory.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >> 1 file changed, 144 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/mm/memory.c b/mm/memory.c
> >> index a8f7e2b28d7a..d23c44cc5092 100644
> >> --- a/mm/memory.c
> >> +++ b/mm/memory.c
> >> @@ -3161,6 +3161,90 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma)
> >> return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX;
> >> }
> >>
> >> +/*
> >> + * Returns index of first pte that is not none, or nr if all are none.
> >> + */
> >> +static inline int check_ptes_none(pte_t *pte, int nr)
> >> +{
> >> + int i;
> >> +
> >> + for (i = 0; i < nr; i++) {
> >> + if (!pte_none(ptep_get(pte++)))
> >> + return i;
> >> + }
> >> +
> >> + return nr;
> >> +}
> >> +
> >> +static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order)
> >
> > As suggested previously in 03/10, we can leave this for later.
>
> I disagree. This is the logic that prevents us from accidentally replacing
> already set PTEs, or wandering out of the VMA bounds etc. How would you catch
> all those corner cases without this?
Again, sorry for not being clear previously: we definitely need to
handle alignments & overlaps. But the fallback, i.e., "for (; order >
1; order--) {" in calc_anon_folio_order_alloc() is not necessary.

For now, we just need something like

  bool is_order_suitable() {
    // check whether it fits properly
  }

Later on, we could add

  alloc_anon_folio_best_effort()
  {
    for a list of fallback orders
      is_order_suitable()
  }
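
To make the split above concrete, here is a minimal userspace sketch in C of the two pieces. It is not kernel code and not the patch's implementation: `struct mock_vma`, `mock_pte_t`, the `pick_order_best_effort()` helper, the 4K page size and the rule that a folio is naturally aligned around the faulting address are all assumptions made for illustration. Only the name `is_order_suitable()` comes from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12                    /* assumed 4K pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical stand-ins for kernel types; 0 models an empty (none) PTE. */
typedef uint64_t mock_pte_t;

struct mock_vma {
	unsigned long start;             /* inclusive, page aligned */
	unsigned long end;               /* exclusive, page aligned */
	mock_pte_t *ptes;                /* one entry per page in [start, end) */
};

/*
 * Alignment and overlap check only, no fallback: does a naturally aligned
 * folio of the given order covering addr fit inside the VMA, and are all
 * PTEs it would cover still empty?
 */
static bool is_order_suitable(const struct mock_vma *vma,
			      unsigned long addr, int order)
{
	unsigned long size = PAGE_SIZE << order;
	unsigned long start = addr & ~(size - 1);
	unsigned long end = start + size;
	size_t first, nr, i;

	assert(addr >= vma->start && addr < vma->end);

	/* The folio must not extend past either VMA boundary. */
	if (start < vma->start || end > vma->end)
		return false;

	/* The folio must not overlap any PTE that is already populated. */
	first = (start - vma->start) >> PAGE_SHIFT;
	nr = (size_t)1 << order;
	for (i = 0; i < nr; i++) {
		if (vma->ptes[first + i] != 0)
			return false;
	}

	return true;
}

/*
 * The later, optional step: walk a caller-supplied list of fallback orders
 * and return the first suitable one, or 0 (a single page) if none fits.
 */
static int pick_order_best_effort(const struct mock_vma *vma,
				  unsigned long addr,
				  const int *orders, size_t nr_orders)
{
	size_t i;

	for (i = 0; i < nr_orders; i++) {
		if (is_order_suitable(vma, addr, orders[i]))
			return orders[i];
	}

	return 0;
}
```

Keeping the suitability check separate from the fallback policy means the first version can call `is_order_suitable()` for a single order, and the fallback list can be added later without touching the check itself.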