Message-ID: <CAJD7tkYBaHpJG8Uzo592cXYwNbzRJ0G8Ju71mkFA+T8uS3eARg@mail.gmail.com>
Date: Wed, 28 Aug 2024 18:01:45 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
"hannes@...xchg.org" <hannes@...xchg.org>, "nphamcs@...il.com" <nphamcs@...il.com>,
"ryan.roberts@....com" <ryan.roberts@....com>, "Huang, Ying" <ying.huang@...el.com>,
"21cnbao@...il.com" <21cnbao@...il.com>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@...el.com>, "Feghali, Wajdi K" <wajdi.k.feghali@...el.com>,
"Gopal, Vinodh" <vinodh.gopal@...el.com>, Chris Li <chrisl@...nel.org>
Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
[..]
> > > In the "Before" scenario, when zswap does not store mTHP, only allocations
> > > count towards the cgroup memory limit. However, in the "After" scenario,
> > > with the introduction of zswap_store() mTHP, both, allocations as well as
> > > the zswap compressed pool usage from all 70 processes are counted
> > towards
> > > the memory limit. As a result, we see higher swapout activity in the
> > > "After" data. Hence, more time is spent doing reclaim as the zswap cgroup
> > > charge leads to more frequent memory.high breaches.
> > >
> > > This causes degradation in throughput and sys time with zswap mTHP,
> > > more so in the case of zstd than deflate-iaa. Compression latency could
> > > play a part in this - when there is more swapout activity happening, a
> > > slower compressor would cause allocations to stall for any/all of the
> > > 70 processes.
> > >
> > > In my opinion, even though the test setup does not provide an accurate
> > > way to make a direct before/after comparison (because zswap usage is
> > > charged to the cgroup, hence counted towards memory.high), it still seems
> > > reasonable for zswap_store to support (m)THP, so that further performance
> > > improvements can be implemented.
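
For context, the cgroup charge described above comes from zswap charging
the compressed object size to the folio's memcg at store time, which is
why the compressed pool counts towards memory.high. Roughly, abridged
from mm/zswap.c (modulo version drift):

	objcg = get_obj_cgroup_from_folio(folio);
	...
	/* charge the compressed size to the folio's memcg */
	if (objcg)
		obj_cgroup_charge_zswap(objcg, entry->length);
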
> >
> > Are you saying that in the "Before" data we end up skipping zswap
> > completely because of using mTHPs?
>
> That's right, Yosry.
>
> >
> > Does it make more sense to turn CONFIG_THP_SWAP off in the "Before" data
>
> We could do this; however, I am not sure whether turning off CONFIG_THP_SWAP
> would have other side-effects, since it may disable mm code paths outside of
> zswap that are intended as mTHP optimizations, which could again skew
> the before/after comparisons.

Yeah, that's possible, but right now we are comparing mTHP swapout that
does not go through zswap at all against mTHP swapout going through zswap.
I think what we really want to measure is 4K swapout going through
zswap vs. mTHP swapout going through zswap. This assumes that current
zswap setups disable CONFIG_THP_SWAP, so we would be measuring the
benefit of allowing them to enable CONFIG_THP_SWAP by supporting it
properly in zswap.
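
To make the "Before" side concrete: with CONFIG_THP_SWAP enabled and no
mTHP support in zswap, zswap_store() bails out on large folios, so the
entire mTHP bypasses zswap and goes straight to the backing swap device.
Abridged from mm/zswap.c before this series (modulo version drift):

	bool zswap_store(struct folio *folio)
	{
		...
		/* Large folios aren't supported */
		if (folio_test_large(folio))
			return false;
		...
	}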

If some setups run zswap with CONFIG_THP_SWAP enabled then that's a
different story, but we already have the data for that case right now,
in case it is a legitimate setup.

Adding Chris Li here from Google. We have CONFIG_THP_SWAP disabled
with zswap, so we would want to know the benefit of supporting
CONFIG_THP_SWAP properly in zswap. At least I think so :)
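
For reference, with CONFIG_THP_SWAP disabled, reclaim cannot allocate
swap entries for a large folio, so it falls back to splitting the folio,
and each 4K page then goes through zswap_store() individually. Roughly,
abridged from shrink_folio_list() in mm/vmscan.c (modulo version drift):

	if (!add_to_swap(folio)) {
		if (!folio_test_large(folio))
			goto activate_locked_split;
		/* Fallback to swap normal pages */
		if (split_folio_to_list(folio, folio_list))
			goto activate_locked;
		if (!add_to_swap(folio))
			goto activate_locked_split;
	}
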
>
> Will wait for Nhat's comments as well.
>
> Thanks,
> Kanchana
>
> > to force the mTHPs to be split and the data to be stored in zswap?
> > This would be a fairer Before/After comparison, where the memory
> > goes to zswap in both cases, but in "Before" the folios have to be
> > split because of zswap's lack of support for mTHP. I assume most setups
> > relying on zswap turn CONFIG_THP_SWAP off today anyway, but maybe not.
> > Nhat, is this something you can share?