Message-ID: <CAJD7tkbhK_BS03MfeM5br+5K2JwXayp9Tay71MqhJ4qdfocDMw@mail.gmail.com>
Date: Thu, 28 Mar 2024 11:45:26 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Chengming Zhou <chengming.zhou@...ux.dev>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Johannes Weiner <hannes@...xchg.org>, 
	Nhat Pham <nphamcs@...il.com>, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 7/9] mm: zswap: store zero-filled pages without a zswap_entry

On Thu, Mar 28, 2024 at 1:12 AM Chengming Zhou <chengming.zhou@...ux.dev> wrote:
>
> On 2024/3/26 07:50, Yosry Ahmed wrote:
> > After the rbtree to xarray conversion, and dropping zswap_entry.refcount
> > and zswap_entry.value, the only members of zswap_entry utilized by
> > zero-filled pages are zswap_entry.length (always 0) and
> > zswap_entry.objcg. Store the objcg pointer directly in the xarray as a
> > tagged pointer and avoid allocating a zswap_entry completely for
> > zero-filled pages.
> >
> > This simplifies the code as we no longer need to special case
> > zero-length cases. We are also able to further separate the zero-filled
> > pages handling logic and completely isolate them within store/load
> > helpers.  Handling tagged xarray pointers is handled in these two
> > helpers, as well as the newly introduced helper for freeing tree
> > elements, zswap_tree_free_element().
> >
> > There is also a small performance improvement observed over 50 runs of
> > kernel build test (kernbench) comparing the mean build time on a skylake
> > machine when building the kernel in a cgroup v1 container with a 3G
> > limit. This is on top of the improvement from dropping support for
> > non-zero same-filled pages:
> >
> >               base            patched         % diff
> > real          69.915          69.757          -0.229%
> > user          2956.147        2955.244        -0.031%
> > sys           2594.718        2575.747        -0.731%
> >
> > This probably comes from avoiding the zswap_entry allocation and
> > cleanup/freeing for zero-filled pages. Note that the percentage of
> > zero-filled pages during this test was only around 1.5% on average.
> > Practical workloads could have a larger proportion of such pages (e.g.
> > Johannes observed around 10% [1]), so the performance improvement should
> > be larger.
> >
> > This change also saves a small amount of memory due to less allocated
> > zswap_entry's. In the kernel build test above, we save around 2M of
> > slab usage when we swap out 3G to zswap.
> >
> > [1]https://lore.kernel.org/linux-mm/20240320210716.GH294822@cmpxchg.org/
> >
> > Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
>
> The code looks good, just one comment below.
>
> Reviewed-by: Chengming Zhou <chengming.zhou@...ux.dev>

Thanks!

>
> > ---
> >  mm/zswap.c | 137 ++++++++++++++++++++++++++++++-----------------------
> >  1 file changed, 78 insertions(+), 59 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 413d9242cf500..efc323bab2f22 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -183,12 +183,11 @@ static struct shrinker *zswap_shrinker;
> >   * struct zswap_entry
> >   *
> [..]
> >
> > @@ -1531,26 +1552,27 @@ bool zswap_load(struct folio *folio)
> >       struct page *page = &folio->page;
> >       struct xarray *tree = swap_zswap_tree(swp);
> >       struct zswap_entry *entry;
> > +     struct obj_cgroup *objcg;
> > +     void *elem;
> >
> >       VM_WARN_ON_ONCE(!folio_test_locked(folio));
> >
> > -     entry = xa_erase(tree, offset);
> > -     if (!entry)
> > +     elem = xa_erase(tree, offset);
> > +     if (!elem)
> >               return false;
> >
> > -     if (entry->length)
> > +     if (!zswap_load_zero_filled(elem, page, &objcg)) {
> > +             entry = elem;
>
> nit: entry seems no use anymore.

I left it here on purpose to avoid casting elem in the next two lines;
it just reads better that way.
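
To illustrate the point about casts, here is a minimal userspace sketch
(not kernel code). The struct definitions are hypothetical stand-ins that
carry only the fields needed here, and both function names are made up for
the example. A typed local absorbs the implicit void * conversion once,
whereas using elem directly needs an explicit cast at every member access.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct obj_cgroup {
	int id;
};

struct zswap_entry_sketch {
	unsigned int length;
	struct obj_cgroup *objcg;
};

/* With a typed local, the void * converts implicitly, once. */
static struct obj_cgroup *objcg_via_local(void *elem)
{
	struct zswap_entry_sketch *entry = elem;

	return entry->objcg;
}

/* Without the local, each member access needs an explicit cast. */
static struct obj_cgroup *objcg_via_cast(void *elem)
{
	return ((struct zswap_entry_sketch *)elem)->objcg;
}
```

Both functions return the same pointer; the difference is purely in how
the call sites read.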

>
> > +             objcg = entry->objcg;
> >               zswap_decompress(entry, page);
> > -     else
> > -             clear_highpage(page);
> > +     }
> >
> >       count_vm_event(ZSWPIN);
> > -     if (entry->objcg)
> > -             count_objcg_event(entry->objcg, ZSWPIN);
> > -
> > -     zswap_entry_free(entry);
> > +     if (objcg)
> > +             count_objcg_event(objcg, ZSWPIN);
> >
> > +     zswap_tree_free_element(elem);
> >       folio_mark_dirty(folio);
> > -
> >       return true;
> >  }
> [..]

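For readers unfamiliar with tagged pointers, the idea from the commit
message can be sketched in userspace C as follows. This is an illustration
under stated assumptions, not the patch's code: the tag bit, the struct
layouts, and every helper name below are hypothetical. The kernel's xarray
has its own helpers for tagging pointers, which this sketch does not use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types. */
struct obj_cgroup {
	int id;
};

struct entry_sketch {
	unsigned int length;
	struct obj_cgroup *objcg;
};

/* Assumption: bit 0 is free because both types are at least 2-byte aligned. */
#define ZERO_FILLED_TAG ((uintptr_t)1)

/* Encode a zero-filled page as a tagged objcg pointer (objcg may be NULL). */
static void *zero_filled_encode(struct obj_cgroup *objcg)
{
	uintptr_t v = (uintptr_t)objcg;

	assert((v & ZERO_FILLED_TAG) == 0);
	return (void *)(v | ZERO_FILLED_TAG);
}

/* True if the stored element is a tagged objcg rather than an entry. */
static bool elem_is_zero_filled(const void *elem)
{
	return ((uintptr_t)elem & ZERO_FILLED_TAG) != 0;
}

/* Strip the tag to recover the objcg pointer. */
static struct obj_cgroup *zero_filled_objcg(const void *elem)
{
	return (struct obj_cgroup *)((uintptr_t)elem & ~ZERO_FILLED_TAG);
}

/* Load-side dispatch: find the objcg for either kind of element. */
static struct obj_cgroup *elem_objcg(void *elem)
{
	struct entry_sketch *entry;

	if (elem_is_zero_filled(elem))
		return zero_filled_objcg(elem);

	entry = elem;
	return entry->objcg;
}
```

In this sketch a zero-filled page with no objcg still encodes to a
non-NULL value, so a NULL result from the lookup continues to mean "no
element stored at this offset".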