lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <SJ0PR11MB56780CF8B669BE97DEAE8DCAC9962@SJ0PR11MB5678.namprd11.prod.outlook.com>
Date: Thu, 29 Aug 2024 03:10:36 +0000
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
To: Yosry Ahmed <yosryahmed@...gle.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, "hannes@...xchg.org"
	<hannes@...xchg.org>, "nphamcs@...il.com" <nphamcs@...il.com>,
	"ryan.roberts@....com" <ryan.roberts@....com>, "Huang, Ying"
	<ying.huang@...el.com>, "21cnbao@...il.com" <21cnbao@...il.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "Zou, Nanhai"
	<nanhai.zou@...el.com>, "Feghali, Wajdi K" <wajdi.k.feghali@...el.com>,
	"Gopal, Vinodh" <vinodh.gopal@...el.com>, Chris Li <chrisl@...nel.org>,
	"Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Subject: RE: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios

Hi Yosry,

> -----Original Message-----
> From: Yosry Ahmed <yosryahmed@...gle.com>
> Sent: Wednesday, August 28, 2024 6:02 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> hannes@...xchg.org; nphamcs@...il.com; ryan.roberts@....com;
> Huang, Ying <ying.huang@...el.com>; 21cnbao@...il.com; akpm@...ux-
> foundation.org; Zou, Nanhai <nanhai.zou@...el.com>; Feghali, Wajdi K
> <wajdi.k.feghali@...el.com>; Gopal, Vinodh <vinodh.gopal@...el.com>; Chris
> Li <chrisl@...nel.org>
> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
> 
> [..]
> > > > In the "Before" scenario, when zswap does not store mTHP, only
> allocations
> > > > count towards the cgroup memory limit. However, in the "After"
> scenario,
> > > > with the introduction of zswap_store() mTHP, both, allocations as well as
> > > > the zswap compressed pool usage from all 70 processes are counted
> > > towards
> > > > the memory limit. As a result, we see higher swapout activity in the
> > > > "After" data. Hence, more time is spent doing reclaim as the zswap
> cgroup
> > > > charge leads to more frequent memory.high breaches.
> > > >
> > > > This causes degradation in throughput and sys time with zswap mTHP,
> more
> > > so
> > > > in case of zstd than deflate-iaa. Compress latency could play a part in
> > > > this - when there is more swapout activity happening, a slower
> compressor
> > > > would cause allocations to stall for any/all of the 70 processes.
> > > >
> > > > In my opinion, even though the test set up does not provide an accurate
> > > > way for a direct before/after comparison (because of zswap usage being
> > > > counted in cgroup, hence towards the memory.high), it still seems
> > > > reasonable for zswap_store to support (m)THP, so that further
> performance
> > > > improvements can be implemented.
> > >
> > > Are you saying that in the "Before" data we end up skipping zswap
> > > completely because of using mTHPs?
> >
> > That's right, Yosry.
> >
> > >
> > > Does it make more sense to turn CONFIG_THP_SWAP in the "Before" data
> >
> > We could do this, however I am not sure if turning off CONFIG_THP_SWAP
> > will have other side-effects in terms of disabling mm code paths outside of
> > zswap that are intended to be mTHP optimizations that could again skew
> > the before/after comparisons.
> 
> Yeah that's possible, but right now we are testing mTHP swapout that
> does not go through zswap at all vs. mTHP swapout going through zswap.
> 
> I think what we really want to measure is 4K swapout going through
> zswap vs. mTHP swapout going through zswap. This assumes that current
> zswap setups disable CONFIG_THP_SWAP, so we would be measuring the
> benefit of allowing them to enable CONFIG_THP_SWAP by supporting it
> properly in zswap.
> 
> If some setups with zswap have CONFIG_THP_SWAP enabled then that's a
> different story, but we already have the data for this case as well
> right now in case this is a legitimate setup.
> 
> Adding Chris Li here from Google. We have CONFIG_THP_SWAP disabled
> with zswap, so for us we would want to know the benefit of supporting
> CONFIG_THP_SWAP properly in zswap. At least I think so :)

Sure, this makes sense. Here's the data that I gathered with CONFIG_THP_SWAP
disabled. We see improvements overall in throughput and sys time for zstd and
deflate-iaa, when comparing before (THP_SWAP=N) vs. after (THP_SWAP=Y):

64K mTHP:
=========

 -------------------------------------------------------------------------------
                       v6.11-rc3 mainline              zswap-mTHP     Change wrt
                                 Baseline                               Baseline
                        CONFIG_THP_SWAP=N       CONFIG_THP_SWAP=Y
--------------------------------------------------------------------------------
 ZSWAP compressor       zstd     deflate-        zstd    deflate-  zstd deflate-
                                      iaa                     iaa            iaa
 -------------------------------------------------------------------------------
 Throughput (KB/s)   136,113      140,044     140,363     151,938    3%       8%
 sys time (sec)       986.78       951.95      954.85      735.47    3%      23%
 memcg_high          124,183      127,513     138,651     133,884
 memcg_swap_high           0            0           0           0
 memcg_swap_fail     619,020      751,099           0           0
 pswpin                    0            0           0           0
 pswpout                   0            0           0           0
 zswpin                  656          569         624         639
 zswpout           9,413,603   11,284,812   9,453,761   9,385,910
 thp_swpout                0            0           0           0
 thp_swpout_               0            0           0           0
  fallback
 pgmajfault            3,470        3,382       4,633       3,611
 ZSWPOUT-64kB            n/a          n/a     590,768     586,521
 SWPOUT-64kB               0            0           0           0
 -------------------------------------------------------------------------------


2M THP:
=======

 ------------------------------------------------------------------------------
                       v6.11-rc3 mainline              zswap-mTHP    Change wrt
                                 Baseline                              Baseline
                        CONFIG_THP_SWAP=N       CONFIG_THP_SWAP=Y
 ------------------------------------------------------------------------------
 ZSWAP compressor       zstd    deflate-        zstd    deflate-  zstd deflate-
                                     iaa                     iaa            iaa
 ------------------------------------------------------------------------------
 Throughput (KB/s)    164,220    172,523      165,005     174,536  0.5%      1%
 sys time (sec)        855.76     686.94       801.72      676.65    6%      1%
 memcg_high            14,628     16,247       14,951      16,096
 memcg_swap_high            0          0            0           0
 memcg_swap_fail       18,698     21,114            0           0
 pswpin                     0          0            0           0
 pswpout                    0          0            0           0
 zswpin                   663        665        5,333         781
 zswpout            8,419,458  8,992,065    8,546,895   9,355,760
 thp_swpout                 0          0            0           0
 thp_swpout_           18,697     21,113            0           0
  fallback
 pgmajfault             3,439      3,496        8,139       3,582
 ZSWPOUT-2048kB           n/a        n/a       16,684      18,270
 SWPOUT-2048kB              0          0            0           0
 -----------------------------------------------------------------------------

Thanks,
Kanchana

> 
> >
> > Will wait for Nhat's comments as well.
> >
> > Thanks,
> > Kanchana
> >
> > > to force the mTHPs to be split and for the data to be stored in zswap?
> > > This would be a more fair Before/After comparison where the memory
> > > goes to zswap in both cases, but "Before" has to be split because of
> > > zswap's lack of support for mTHP. I assume most setups relying on
> > > zswap will be turning CONFIG_THP_SWAP off today anyway, but maybe
> not.
> > > Nhat, is this something you can share?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ