Message-ID: <CAGsJ_4yqLVvCUFpHjWmNAYvPRMzGK8JJWYMXJLR7d9UhKp+QDA@mail.gmail.com>
Date: Tue, 30 Jul 2024 08:03:05 +1200
From: Barry Song <21cnbao@...il.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org, ying.huang@...el.com,
baolin.wang@...ux.alibaba.com, chrisl@...nel.org, david@...hat.com,
hannes@...xchg.org, hughd@...gle.com, kaleshsingh@...gle.com,
kasong@...cent.com, linux-kernel@...r.kernel.org, mhocko@...e.com,
minchan@...nel.org, nphamcs@...il.com, ryan.roberts@....com,
senozhatsky@...omium.org, shakeel.butt@...ux.dev, shy828301@...il.com,
surenb@...gle.com, v-songbaohua@...o.com, xiang@...nel.org,
yosryahmed@...gle.com, Chuanhua Han <hanchuanhua@...o.com>
Subject: Re: [PATCH v5 3/4] mm: support large folios swapin as a whole for
zRAM-like swapfile
On Tue, Jul 30, 2024 at 3:13 AM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Tue, Jul 30, 2024 at 01:11:31AM +1200, Barry Song wrote:
> > for this zRAM case, it is a new allocated large folio, only
> > while all conditions are met, we will allocate and map
> > the whole folio. you can check can_swapin_thp() and
> > thp_swap_suitable_orders().
>
> YOU ARE DOING THIS WRONGLY!
>
> All of you anonymous memory people are utterly fixated on TLBs AND THIS
> IS WRONG. Yes, TLB performance is important, particularly with crappy
> ARM designs, which I know a lot of you are paid to work on. But you
> seem to think this is the only consideration, and you're making bad
> design choices as a result. It's overly complicated, and you're leaving
> performance on the table.
>
> Look back at the results Ryan showed in the early days of working on
> large anonymous folios. Half of the performance win on his system came
> from using larger TLBs. But the other half came from _reduced software
> overhead_. The LRU lock is a huge problem, and using large folios cuts
> the length of the LRU list, hence LRU lock hold time.
>
> Your _own_ data on how hard it is to get hold of a large folio due to
> fragmentation should be enough to convince you that the more large folios
> in the system, the better the whole system runs. We should not decline to
> allocate large folios just because they can't be mapped with a single TLB!
I am not convinced. For a newly allocated large folio, even alloc_anon_folio(),
called from do_anonymous_page(), does exactly the same thing:
alloc_anon_folio()
{
	/*
	 * Get a list of all the (large) orders below PMD_ORDER that are enabled
	 * for this vma. Then filter out the orders that can't be allocated over
	 * the faulting address and still be fully contained in the vma.
	 */
	orders = thp_vma_allowable_orders(vma, vma->vm_flags,
			TVA_IN_PF | TVA_ENFORCE_SYSFS, BIT(PMD_ORDER) - 1);
	orders = thp_vma_suitable_orders(vma, vmf->address, orders);
}
You are not going to allocate an mTHP for an unaligned address on a new
page fault either.
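As a rough userspace illustration of the filtering described in that comment
(this is a sketch, not the kernel code): struct vma_range, order_fits(),
filter_orders(), the 4 KiB page size and the order limit of 9 are all
assumptions made up for this example. It keeps only those orders whose
naturally aligned block around the faulting address lies fully inside the vma.

```c
#include <assert.h>

#define PAGE_SHIFT 12        /* assumption: 4 KiB base pages */
#define SKETCH_MAX_ORDER 9   /* assumption: PMD order with 4 KiB pages */

/* Hypothetical stand-in for the [vm_start, vm_end) range of a vma. */
struct vma_range {
	unsigned long start;
	unsigned long end;   /* exclusive */
};

/*
 * Return nonzero if the naturally aligned block of the given order that
 * contains addr lies entirely inside the vma range.
 */
static int order_fits(const struct vma_range *vma, unsigned long addr,
		      int order)
{
	unsigned long size = 1UL << (PAGE_SHIFT + order);
	unsigned long base = addr & ~(size - 1);   /* align down to size */

	return base >= vma->start && base + size <= vma->end;
}

/*
 * Filter a bitmask of candidate orders (bit N set means order N is
 * enabled). Only large orders 1 .. SKETCH_MAX_ORDER - 1 are considered;
 * order 0 and any higher bits are dropped from the result.
 */
static unsigned long filter_orders(const struct vma_range *vma,
				   unsigned long addr, unsigned long orders)
{
	unsigned long kept = 0;
	int order;

	assert(vma->start < vma->end);   /* sanity check on the sketch input */

	for (order = 1; order < SKETCH_MAX_ORDER; order++) {
		if ((orders & (1UL << order)) && order_fits(vma, addr, order))
			kept |= 1UL << order;
	}
	return kept;
}
```

With a range of 0x10000-0x30000 and a fault at 0x14000, orders 1 to 4 survive
while order 5 is dropped, because its aligned block would start at 0x0, below
the start of the range.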
Please point out where it is wrong.
Thanks
Barry