Message-ID: <SJ0PR11MB567870784D380DE5EDB29AEBC9762@SJ0PR11MB5678.namprd11.prod.outlook.com>
Date: Mon, 30 Sep 2024 17:55:44 +0000
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
To: Yosry Ahmed <yosryahmed@...gle.com>, Johannes Weiner <hannes@...xchg.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>, "nphamcs@...il.com"
<nphamcs@...il.com>, "chengming.zhou@...ux.dev" <chengming.zhou@...ux.dev>,
"usamaarif642@...il.com" <usamaarif642@...il.com>, "shakeel.butt@...ux.dev"
<shakeel.butt@...ux.dev>, "ryan.roberts@....com" <ryan.roberts@....com>,
"Huang, Ying" <ying.huang@...el.com>, "21cnbao@...il.com"
<21cnbao@...il.com>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@...el.com>, "Feghali, Wajdi K"
<wajdi.k.feghali@...el.com>, "Gopal, Vinodh" <vinodh.gopal@...el.com>,
"Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Subject: RE: [PATCH v8 6/8] mm: zswap: Support large folios in zswap_store().

> -----Original Message-----
> From: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> Sent: Sunday, September 29, 2024 2:15 PM
> To: Yosry Ahmed <yosryahmed@...gle.com>; Johannes Weiner
> <hannes@...xchg.org>
> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> nphamcs@...il.com; chengming.zhou@...ux.dev;
> usamaarif642@...il.com; shakeel.butt@...ux.dev; ryan.roberts@....com;
> Huang, Ying <ying.huang@...el.com>; 21cnbao@...il.com; akpm@...ux-
> foundation.org; Zou, Nanhai <nanhai.zou@...el.com>; Feghali, Wajdi K
> <wajdi.k.feghali@...el.com>; Gopal, Vinodh <vinodh.gopal@...el.com>;
> Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> Subject: RE: [PATCH v8 6/8] mm: zswap: Support large folios in zswap_store().
>
> > -----Original Message-----
> > From: Yosry Ahmed <yosryahmed@...gle.com>
> > Sent: Saturday, September 28, 2024 11:11 AM
> > To: Johannes Weiner <hannes@...xchg.org>
> > Cc: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>; linux-
> > kernel@...r.kernel.org; linux-mm@...ck.org; nphamcs@...il.com;
> > chengming.zhou@...ux.dev; usamaarif642@...il.com;
> > shakeel.butt@...ux.dev; ryan.roberts@....com; Huang, Ying
> > <ying.huang@...el.com>; 21cnbao@...il.com; akpm@...ux-
> foundation.org;
> > Zou, Nanhai <nanhai.zou@...el.com>; Feghali, Wajdi K
> > <wajdi.k.feghali@...el.com>; Gopal, Vinodh <vinodh.gopal@...el.com>
> > Subject: Re: [PATCH v8 6/8] mm: zswap: Support large folios in
> zswap_store().
> >
> > On Sat, Sep 28, 2024 at 7:15 AM Johannes Weiner <hannes@...xchg.org>
> > wrote:
> > >
> > > On Fri, Sep 27, 2024 at 08:42:16PM -0700, Yosry Ahmed wrote:
> > > > On Fri, Sep 27, 2024 at 7:16 PM Kanchana P Sridhar
> > > > > {
> > > > > + struct page *page = folio_page(folio, index);
> > > > > swp_entry_t swp = folio->swap;
> > > > > - pgoff_t offset = swp_offset(swp);
> > > > > struct xarray *tree = swap_zswap_tree(swp);
> > > > > + pgoff_t offset = swp_offset(swp) + index;
> > > > > struct zswap_entry *entry, *old;
> > > > > - struct obj_cgroup *objcg = NULL;
> > > > > - struct mem_cgroup *memcg = NULL;
> > > > > -
> > > > > - VM_WARN_ON_ONCE(!folio_test_locked(folio));
> > > > > - VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
> > > > > + int type = swp_type(swp);
> > > >
> > > > Why do we need type? We use it when initializing entry->swpentry to
> > > > reconstruct the swp_entry_t we already have.
> > >
> > > It's not the same entry. folio->swap points to the head entry, this
> > > function has to store swap entries with the offsets of each subpage.
> >
> > Duh, yeah, thanks.
> >
> > >
> > > Given the name of this function, it might be better to actually pass a
> > > page pointer to it; do the folio_page() inside zswap_store().
> > >
> > > Then do
> > >
> > > entry->swpentry = page_swap_entry(page);
> > >
> > > below.
> >
> > That is indeed clearer.
> >
> > Although this will be adding yet another caller of page_swap_entry()
> > that already has the folio, yet it calls page_swap_entry() for each
> > page in the folio, which calls page_folio() inside.
> >
> > I wonder if we should add (or replace page_swap_entry()) with a
> > folio_swap_entry(folio, index) helper. This can also be done as a
> > follow up.
>
> Thanks Johannes and Yosry for these comments. I was thinking about
> this some more. In its current form, zswap_store_page() is called in
> the context of the folio by passing in a [folio, index] pair. This
> encodes a key assumption of the existing zswap_store() large-folio
> functionality: the per-page store is done for the page at offset
> "index * PAGE_SIZE" within the folio, not for an arbitrary page.
> Further, we need the folio for folio_nid(), though this can also be
> derived from the page. Another reason I thought the existing
> signature might be preferable is that it appears to obtain the
> entry's swp_entry_t with fewer computes. Could calling
> page_swap_entry() add more computes, which could potentially add up
> (say, 512 times for a 2M folio)?
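
For reference, here is roughly what the two derivations being compared
look like. This is a sketch based on the v8 patch, with hypothetical
helper names (store_entry_v8()/store_entry_page()); it is not the
literal patch code:

#include <linux/mm.h>      /* folio_page() */
#include <linux/swap.h>    /* page_swap_entry() */
#include <linux/swapops.h> /* swp_entry(), swp_type(), swp_offset() */

/* v8 approach: reconstruct the per-page swap entry from the folio's
 * head entry by adding the page's index within the folio.
 */
static swp_entry_t store_entry_v8(struct folio *folio, long index)
{
	swp_entry_t swp = folio->swap;

	return swp_entry(swp_type(swp), swp_offset(swp) + index);
}

/* Suggested approach: pass the page and let page_swap_entry() derive
 * the entry; internally it calls page_folio() and folio_page_idx(),
 * which is the extra compute being measured below.
 */
static swp_entry_t store_entry_page(struct page *page)
{
	return page_swap_entry(page);
}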

I went ahead and quantified this, comparing the v8 signature of
zswap_store_page() against the suggested change of having the function
take a page and use page_swap_entry(). I ran usemem with 2M
PMD-mappable folios enabled. The results indicate that the
page_swap_entry() implementation is slightly better in both throughput
and latency:
v8:                              run1       run2       run3    average
----------------------------------------------------------------------
Total throughput (KB/s):    6,483,835  6,396,760  6,349,532  6,410,042
Average throughput (KB/s):    216,127    213,225    211,651    213,889
elapsed time (sec):            107.75     107.06     109.99     108.87
sys time (sec):              2,476.43   2,453.99   2,551.52   2,513.98
----------------------------------------------------------------------

page_swap_entry():               run1       run2       run3    average
----------------------------------------------------------------------
Total throughput (KB/s):    6,462,954  6,396,134  6,418,076  6,425,721
Average throughput (KB/s):    215,431    213,204    213,935    214,683
elapsed time (sec):            108.67     109.46     107.91     108.29
sys time (sec):              2,473.65   2,493.33   2,507.82   2,490.74
----------------------------------------------------------------------
Based on this, I will go ahead and implement the change suggested
by Johannes and submit a v9.
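
As a possible follow-up along the lines Yosry suggested, a helper that
takes the folio directly could avoid the page_folio() lookup that
page_swap_entry() performs internally when the caller already has the
folio. A minimal sketch (hypothetical; not an existing kernel API):

/* Hypothetical folio_swap_entry(): derive the swap entry for the page
 * at @index within @folio from the head entry in folio->swap. Mirrors
 * page_swap_entry() minus the page_folio()/folio_page_idx() work.
 */
static inline swp_entry_t folio_swap_entry(struct folio *folio, long index)
{
	swp_entry_t entry = folio->swap;

	entry.val += index;
	return entry;
}
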
Thanks,
Kanchana
>
> I would appreciate your thoughts on whether these are valid
> considerations, so I can proceed accordingly.
>
> >
> > >
> > > > > obj_cgroup_put(objcg);
> > > > > - if (zswap_pool_reached_full)
> > > > > - queue_work(shrink_wq, &zswap_shrink_work);
> > > > > -check_old:
> > > > > + return false;
> > > > > +}
> > > > > +
> > > > > +bool zswap_store(struct folio *folio)
> > > > > +{
> > > > > + long nr_pages = folio_nr_pages(folio);
> > > > > + swp_entry_t swp = folio->swap;
> > > > > + struct xarray *tree = swap_zswap_tree(swp);
> > > > > + pgoff_t offset = swp_offset(swp);
> > > > > + struct obj_cgroup *objcg = NULL;
> > > > > + struct mem_cgroup *memcg = NULL;
> > > > > + struct zswap_pool *pool;
> > > > > + size_t compressed_bytes = 0;
> > > >
> > > > Why size_t? entry->length is int.
> > >
> > > In light of Willy's comment, I think size_t is a good idea.
> >
> > Agreed.
>
> Thanks Yosry, Matthew and Johannes for the resolution on this!
>
> Thanks,
> Kanchana