[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJD7tkYmOCCR7iipEFF69woXQxOr7byj+GuafGQVBL=UCHUK5A@mail.gmail.com>
Date: Thu, 11 Jan 2024 03:27:44 -0800
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Zhongkun He <hezhongkun.hzk@...edance.com>
Cc: Nhat Pham <nphamcs@...il.com>, akpm@...ux-foundation.org, hannes@...xchg.org,
sjenning@...hat.com, ddstreet@...e.org, vitaly.wool@...sulko.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Chris Li <chrisl@...nel.org>, Matthew Wilcox <willy@...radead.org>
Subject: Re: [External] Re: [PATCH] mm: zswap: fix the lack of page lru flag
in zswap_writeback_entry
On Wed, Jan 10, 2024 at 7:49 PM Zhongkun He
<hezhongkun.hzk@...edance.com> wrote:
>
> >
> > This sounds dangerous. This is going to introduce a rather large
> > unexpected side effect - we're changing the readahead behavior in a
> > seemingly small zswap optimization. In fact, I'd argue that if we do
> > this, the readahead behavior change will be the "main effect", and the
> > zswap-side change would be a "happy consequence". We should run a lot
> > of benchmarking and document the change extensively if we pursue this
> > route.
> >
>
> I agree with the unexpected side effect, and here I need
> to clarify the original intention of this patch.Please see the memory
> offloading steps below.
>
>
> memory zswap(reclaim) memory+swap (writeback)
> 1G 0.5G 1G(tmp memory) + 1G(swap)
>
> If the decompressed memory cannot be released in time,
> zswap's writeback has great side effects(mostly clod pages).
> On the one hand, the memory space has not been reduced,
> but has increased (from 0.5G->1G).
> At the same time, it is not put the pages to the tail of the lru.
> When the memory is insufficient, other pages will be squeezed out
> and released early.
> With this patch, we can put the tmp pages to the tail and reclaim it
> in time when the memory is insufficient or actively reclaimed.
> So I think this patch makes sense and hope it can be fixed with a
> suitable approaches.
>
> >
> > Unless some page flag/readahead expert can confirm that the first
> > option is safe, my vote is on this option. I mean, it's fairly minimal
> > codewise, no? Just a bunch of plumbing. We can also keep the other
> > call sites intact if we just rename the old versions - something along
> > the line of:
> >
> > __read_swap_cache_async_head(..., bool add_to_lru_head)
> > {
> > ...
> > if (add_to_lru_head)
> > folio_add_lru(folio)
> > else
> > folio_add_lru_tail(folio);
> > }
> >
> > __read_swap_cache_async(...)
> > {
> > return __read_swap_cache_async_tail(..., true);
> > }
> >
> > A bit boilerplate? Sure. But this seems safer, and I doubt it's *that*
> > much more work.
> >
>
> Yes, agree. I will try it again.
I agree with Nhat here. Unless someone with enough readahead/page
flags knowledge says putting PG_readahead pages at the tail of the LRU
is okay (doubtful), I think we should opt for introducing a
folio_add_lru_tail() as I initially suggested.
Powered by blists - more mailing lists