linux-kernel - Re: [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJD7tkYFTQHn4vgNVt8k-aNsXE2Ahi82yprHU=nAZqRpLqQSNg@mail.gmail.com>
Date:   Thu, 3 Nov 2022 13:46:56 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Minchan Kim <minchan@...nel.org>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Nhat Pham <nphamcs@...il.com>, akpm@...ux-foundation.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        ngupta@...are.org, sjenning@...hat.com, ddstreet@...e.org,
        vitaly.wool@...sulko.com
Subject: Re: [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and
 size_class's locks

On Thu, Nov 3, 2022 at 1:37 PM Minchan Kim <minchan@...nel.org> wrote:
>
> On Thu, Nov 03, 2022 at 11:10:47AM -0700, Yosry Ahmed wrote:
> < snip >
>
> > > > > > I am also worry about that LRU stuff should be part of allocator
> > > > > > instead of higher level.
> > > > >
> > > > > I'm sorry, but that's not a reasonable objection.
> > > > >
> > > > > These patches implement a core feature of being a zswap backend, using
> > > > > standard LRU and locking techniques established by the other backends.
> > > > >
> > > > > I don't disagree that it would nicer if zswap had a strong abstraction
> > > > > for backend pages and a generalized LRU. But that is major surgery on
> > > > > a codebase of over 6,500 lines. It's not a reasonable ask to change
> > > > > all that first before implementing a basic feature that's useful now.
> > > >
> > > > With same logic, folks added the LRU logic into their allocators
> > > > without the effort considering moving the LRU into upper layer.
> > > >
> > > > And then trend is still going on since I have seen multiple times
> > > > people are trying to add more allocators. So if it's not a reasonable
> > > > ask to consier, we couldn't stop the trend in the end.
> > >
> > > So there is actually an ongoing effort to do that. Yosry and I have
> > > spent quite some time on coming up with an LRU design that's
> > > independent from compression policy over email and at Plumbers.
> > >
> > > My concern is more about the order of doing things:
> > >
> > > 1. The missing writeback support is a gaping hole in zsmalloc, which
> > >    affects production systems. A generalized LRU list is a good idea,
> > >    but it's a huge task that from a user pov really is not
> > >    critical. Even from a kernel dev / maintainer POV, there are bigger
> > >    fish to fry in the zswap code base and the backends than this.
> > >
> > > 2. Refactoring existing functionality is much easier than writing
> > >    generalized code that simultaneously enables new behavior. zsmalloc
> > >    is the most complex of our backends. To make its LRU writeback work
> > >    we had to patch zswap's ->map ordering to accomodate it, e.g. Such
> > >    tricky changes are easier to make and test incrementally.
> > >
> > >    The generalized LRU project will hugely benefit from already having
> > >    a proven writeback implementation in zsmalloc, because then all the
> > >    requirements in zswap and zsmalloc will be in black and white.
> > >
> > > > > I get that your main interest is zram, and so this feature isn't of
> > > > > interest to you. But zram isn't the only user, nor is it the primary
> > > >
> > > > I am interest to the feature but my interest is more of general swap
> > > > layer to manage the LRU so that it could support any hierarchy among
> > > > swap devices, not only zswap.
> > >
> > > I think we're on the same page about the longer term goals.
> > >
> >
> > Yeah. As Johannes said, I was also recently looking into this. This
> > can also help solve other problems than consolidating implementations.
> > Currently if zswap rejects a page, it goes into swap, which is
> > more-or-less a violation of page LRUs since hotter pages that are more
> > recently reclaimed end up in swap (slow), while colder pages that were
> > reclaimed before are in zswap. Having a separate layer managing the
> > LRU of swap pages can also make sure this doesn't happen.
>
> True.
>
> >
> > More broadly, making zswap a separate layer from swap enables other
> > improvements such as using zswap regardless of the presence of a
> > backend swapfile and not consuming space in swapfiles if a page is in
> > zswap. Of course, this is a much larger surgery.
>
> If we could decouple the LRU writeback from zswap and supports
> compression without backing swapfile, sounds like becoming more of
> zram. ;-)

That's a little bit grey. Maybe we can consolidate them one day :)

We have been using zswap without swapfile at Google for a while, this
gives us the ability to reject pages that do not compress well enough
for us, which I suspect zram would not support given that it is
designed to be the final destination of the page. Also, having the
same configuration and code running on machines whether or not they
have a swapfile is nice, otherwise one would need to use zram if there
is no swapfile and switch to zswap if there is one.

>
> >
> > I am intending to spend more time looking further into this, but other
> > things keep popping up :)
>
> Same with me. Thanks for looking it, Yosry!