linux-kernel - Re: [PATCH 0/2] RFC: zswap tree use xarray instead of RB tree


Message-ID: <CAF8kJuNPPruLDOEqH-f-w1zw-TSuWkUZsQ43Qe_EtycapXgkbQ@mail.gmail.com>
Date: Wed, 17 Jan 2024 23:19:39 -0800
From: Chris Li <chrisl@...nel.org>
To: Yosry Ahmed <yosryahmed@...gle.com>
Cc: Chengming Zhou <zhouchengming@...edance.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	Wei Xu <weixugc@...gle.com>, Yu Zhao <yuzhao@...gle.com>, 
	Greg Thelen <gthelen@...gle.com>, Chun-Tse Shao <ctshao@...gle.com>, 
	Suren Baghdasaryan <surenb@...gle.com>, 
	Brain Geffon <bgeffon@...gle.com>, Minchan Kim <minchan@...nel.org>, Michal Hocko <mhocko@...e.com>, 
	Mel Gorman <mgorman@...hsingularity.net>, Huang Ying <ying.huang@...el.com>, 
	Nhat Pham <nphamcs@...il.com>, Johannes Weiner <hannes@...xchg.org>, Kairui Song <kasong@...cent.com>, 
	Zhongkun He <hezhongkun.hzk@...edance.com>, Kemeng Shi <shikemeng@...weicloud.com>, 
	Barry Song <v-songbaohua@...o.com>, "Matthew Wilcox (Oracle)" <willy@...radead.org>, 
	"Liam R. Howlett" <Liam.Howlett@...cle.com>, Joel Fernandes <joel@...lfernandes.org>
Subject: Re: [PATCH 0/2] RFC: zswap tree use xarray instead of RB tree

On Wed, Jan 17, 2024 at 11:02 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
>
> On Wed, Jan 17, 2024 at 10:57 PM Chengming Zhou
> <zhouchengming@...edance.com> wrote:
> >
> > Hi Yosry and Chris,
> >
> > On 2024/1/18 14:39, Yosry Ahmed wrote:
> > > On Wed, Jan 17, 2024 at 10:01 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> > >>
> > >> That's a long CC list for sure :)
> > >>
> > >> On Wed, Jan 17, 2024 at 7:06 PM Chris Li <chrisl@...nel.org> wrote:
> > >>>
> > >>> The RB tree shows some contribution to the swap fault
> > >>> long tail latency due to two factors:
> > >>> 1) RB tree requires re-balance from time to time.
> > >>> 2) The zswap RB tree has a tree level spin lock protecting
> > >>> the tree access.
> > >>>
> > >>> The swap cache is using xarray. The breakdown of the swap
> > >>> cache access does not show similarly long times as the zswap
> > >>> RB tree.
> > >>
> > >> I think the comparison to the swap cache may not be valid as the swap
> > >> cache has many trees per swapfile, while zswap has a single tree.
> > >>
> > >>>
> > >>> Moving the zswap entry to xarray enables the read side
> > >>> to take only the RCU read lock.
> > >>
> > >> Nice.
> > >>
> > >>>
> > >>> The first patch adds the xarray alongside the RB tree.
> > >>> There is some debug check asserting the xarray agrees with
> > >>> the RB tree results.
> > >>>
> > >>> The second patch removes the zswap RB tree.
> > >>
> > >> The breakdown looks like something that would be a development step,
> > >> but for patch submission I think it makes more sense to have a single
> > >> patch replacing the rbtree with an xarray.
> > >>
> > >>>
> > >>> I expect to merge the zswap rb tree spin lock with the xarray
> > >>> lock in the follow up changes.
> > >>
> > >> Shouldn't this simply be changing uses of tree->lock to use
> > >> xa_{lock/unlock}? We also need to make sure we don't try to lock the
> > >> tree when operating on the xarray if the caller is already holding the
> > >> lock, but this seems to be straightforward enough to be done as part
> > >> of this patch or this series at least.
> > >>
> > >> Am I missing something?
> > >
> > > Also, I assume we will only see performance improvements after the
> > > tree lock in its current form is removed so that we get loads
> > > protected only by RCU. Can we get some performance numbers to see how
> > > the latency improves with the xarray under contention (unless
> > > Chengming is already planning on testing this for his multi-tree
> > > patches).
> >
> > I just gave it a try, using the same test of a kernel build in tmpfs with the
> > zswap shrinker enabled, all based on the latest mm/mm-stable branch.
> >
> >                     mm-stable           zswap-split-tree    zswap-xarray
> > real                1m10.442s           1m4.157s            1m9.962s
> > user                17m48.232s          17m41.477s          17m45.887s
> > sys                 8m13.517s           5m2.226s            7m59.305s
> >
> > Looks like the lock contention under concurrency is still there. I haven't
> > looked into the code yet, will review it later.

Thanks for the quick test. Interesting to see the sys time drop for
the xarray case even with the spin lock still in place.
Not sure whether the roughly 14 second saving (8m13.5s vs. 7m59.3s) is
statistically significant.
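
As a quick sanity check on the sys numbers, here is a small Python sketch
that converts the stamps from Chengming's table to seconds and prints the
saving against mm-stable. The helper names (to_seconds, sys_saving) are made
up for illustration; only the three sys values come from the table. It says
nothing about run-to-run variance, which would need repeated runs.

```python
# Sys times copied from the table in this thread; everything else is illustrative.
SYS = {
    "mm-stable": "8m13.517s",
    "zswap-split-tree": "5m2.226s",
    "zswap-xarray": "7m59.305s",
}


def to_seconds(stamp: str) -> float:
    """Convert a time(1)-style 'XmY.ZZZs' stamp to seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


def sys_saving(baseline: str, candidate: str) -> tuple[float, float]:
    """Return (seconds saved, percent saved) of candidate relative to baseline."""
    base = to_seconds(SYS[baseline])
    cand = to_seconds(SYS[candidate])
    saved = base - cand
    return saved, 100.0 * saved / base


for name in ("zswap-split-tree", "zswap-xarray"):
    saved, pct = sys_saving("mm-stable", name)
    print(f"{name}: {saved:.1f}s less sys time than mm-stable ({pct:.1f}%)")
```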

We might need both the xarray and the split trees for zswap. Removing
the spin lock alone likely would not make up the roughly 35% difference
in sys time against the split-tree run. That is just my guess. There is
only one way to find out.
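
To illustrate why splitting the tree helps independently of the underlying
data structure, below is a rough userspace sketch in Python, not kernel code.
Swap offsets are mapped to shards and each shard has its own lock, so
operations on different offset ranges do not contend on one lock. SplitTree,
RANGE_SHIFT and the shard size are hypothetical names and values chosen for
the example; they are not taken from Chengming's series.

```python
import threading

RANGE_SHIFT = 14  # hypothetical: each shard covers 2**14 consecutive offsets


class SplitTree:
    """Offset -> entry map split into shards, each guarded by its own lock."""

    def __init__(self, max_offset: int):
        nr_shards = (max_offset >> RANGE_SHIFT) + 1
        self._locks = [threading.Lock() for _ in range(nr_shards)]
        self._maps = [{} for _ in range(nr_shards)]

    def _shard(self, offset: int) -> int:
        # Neighbouring offsets land in the same shard.
        return offset >> RANGE_SHIFT

    def store(self, offset: int, entry) -> None:
        i = self._shard(offset)
        with self._locks[i]:  # only this shard is serialized
            self._maps[i][offset] = entry

    def load(self, offset: int):
        i = self._shard(offset)
        with self._locks[i]:
            return self._maps[i].get(offset)

    def erase(self, offset: int):
        i = self._shard(offset)
        with self._locks[i]:
            return self._maps[i].pop(offset, None)


tree = SplitTree(max_offset=1 << 16)
tree.store(5, "entry-a")
tree.store((1 << RANGE_SHIFT) + 5, "entry-b")  # different shard, different lock
```

The lookups here still take a lock. The xarray change is orthogonal: it
targets the read side within one tree, which is why the two approaches could
be combined.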

BTW, do you have a script I can run to replicate your results?

>
> I think that's expected with the current version because the tree
> spin_lock is still there and we are still doing lookups with a
> spinlock.

Right.

Chris