Message-ID: <SJ0PR11MB5678B5DFF28E5F21CEC81B11C96C2@SJ0PR11MB5678.namprd11.prod.outlook.com>
Date: Fri, 20 Sep 2024 02:34:58 +0000
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
To: Chengming Zhou <chengming.zhou@...ux.dev>, Nhat Pham <nphamcs@...il.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, "hannes@...xchg.org"
	<hannes@...xchg.org>, "yosryahmed@...gle.com" <yosryahmed@...gle.com>,
	"ryan.roberts@....com" <ryan.roberts@....com>, "Huang, Ying"
	<ying.huang@...el.com>, "21cnbao@...il.com" <21cnbao@...il.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "Zou, Nanhai"
	<nanhai.zou@...el.com>, "Feghali, Wajdi K" <wajdi.k.feghali@...el.com>,
	"Gopal, Vinodh" <vinodh.gopal@...el.com>, Usama Arif
	<usamaarif642@...il.com>, "Sridhar, Kanchana P"
	<kanchana.p.sridhar@...el.com>
Subject: RE: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios

Hi Chengming,

> -----Original Message-----
> From: Chengming Zhou <chengming.zhou@...ux.dev>
> Sent: Thursday, August 29, 2024 9:52 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>; Nhat Pham
> <nphamcs@...il.com>
> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> hannes@...xchg.org; yosryahmed@...gle.com; ryan.roberts@....com;
> Huang, Ying <ying.huang@...el.com>; 21cnbao@...il.com;
> akpm@...ux-foundation.org; Zou, Nanhai <nanhai.zou@...el.com>;
> Feghali, Wajdi K <wajdi.k.feghali@...el.com>;
> Gopal, Vinodh <vinodh.gopal@...el.com>; Usama Arif <usamaarif642@...il.com>
> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
> 
> On 2024/8/30 03:38, Sridhar, Kanchana P wrote:
> > Hi Nhat,
> >
> >> -----Original Message-----
> >> From: Nhat Pham <nphamcs@...il.com>
> >> Sent: Thursday, August 29, 2024 10:11 AM
> >> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> >> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> >> hannes@...xchg.org; yosryahmed@...gle.com; ryan.roberts@....com;
> >> Huang, Ying <ying.huang@...el.com>; 21cnbao@...il.com;
> >> akpm@...ux-foundation.org; Zou, Nanhai <nanhai.zou@...el.com>;
> >> Feghali, Wajdi K <wajdi.k.feghali@...el.com>;
> >> Gopal, Vinodh <vinodh.gopal@...el.com>;
> >> Usama Arif <usamaarif642@...il.com>;
> >> Chengming Zhou <chengming.zhou@...ux.dev>
> >> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
> >>
> >> On Wed, Aug 28, 2024 at 5:06 PM Sridhar, Kanchana P
> >> <kanchana.p.sridhar@...el.com> wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Nhat Pham <nphamcs@...il.com>
> >>>> Sent: Wednesday, August 28, 2024 2:35 PM
> >>>> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> >>>> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> >>>> hannes@...xchg.org; yosryahmed@...gle.com; ryan.roberts@....com;
> >>>> Huang, Ying <ying.huang@...el.com>; 21cnbao@...il.com;
> >>>> akpm@...ux-foundation.org; Zou, Nanhai <nanhai.zou@...el.com>;
> >>>> Feghali, Wajdi K <wajdi.k.feghali@...el.com>;
> >>>> Gopal, Vinodh <vinodh.gopal@...el.com>
> >>>> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
> >>>>
> >>>> On Wed, Aug 28, 2024 at 2:35 AM Kanchana P Sridhar
> >>>> <kanchana.p.sridhar@...el.com> wrote:
> >>>>>
> >>>>> Hi All,
> >>>>>
> >>>>> This patch-series enables zswap_store() to accept and store mTHP
> >>>>> folios. The most significant contribution in this series is from the
> >>>>> earlier RFC submitted by Ryan Roberts [1]. Ryan's original RFC has
> >>>>> been migrated to v6.11-rc3 in patch 2/4 of this series.
> >>>>>
> >>>>> [1]: [RFC PATCH v1] mm: zswap: Store large folios without splitting
> >>>>>       https://lore.kernel.org/linux-mm/20231019110543.3284654-1-ryan.roberts@....com/T/#u
> >>>>>
> >>>>> Additionally, there is an attempt to modularize some of the
> >>>>> functionality in zswap_store(), to make it more amenable to
> >>>>> supporting any-order mTHPs. For instance, the function
> >>>>> zswap_store_entry() stores a zswap_entry in the xarray. Likewise,
> >>>>> zswap_delete_stored_offsets() can be used to delete all offsets
> >>>>> corresponding to a higher order folio stored in zswap.
> >>>>>
> >>>>
> >>>> Will this have any conflict with mTHP swap work? Especially with mTHP
> >>>> swap-in and zswap writeback.
> >>>>
> >>>> My understanding is from zswap's perspective, the large folio is
> >>>> broken apart into independent subpages, correct? What happens when
> >>>> we have partially written back mTHP (i.e some subpages are in zswap
> >>>> still, whereas others are written back to swap). Would this
> >>>> automatically prevent mTHP swapin?
> >>>
> >>> That is a good point. To begin with, this patch-series would make the
> >>> default behavior for mTHP swapout/storage and swapin for ZSWAP to be
> >>> on par with ZRAM. From zswap's perspective, imo this is a significant
> >>> step forward towards realizing cold memory storage with mTHP folios.
> >>> However, it is only a starting point that makes the behavior uniform
> >>> across zswap/zram. Initially, workloads would see a one-time benefit
> >>> with reclaim being able to swapout mTHP folios without splitting, to
> >>> zswap. If the mTHPs were cold memory, then we would have derived
> >>> latency gains towards memory savings (with zswap).
> >>>
> >>> However, if the mTHP were part of "not so cold" memory, this would
> >>> result in a one-way mTHP conversion to 4K folios. Depending on
> >>> workloads and their access patterns, we could either see individual
> >>> 4K folios being swapped in, or entire chunks if not the entire
> >>> (original) mTHP needing to be swapped in.
> >>>
> >>> It should be noted that this is more of a performance vs. cold memory
> >>> preservation trade-off that needs to drive mTHP reclaim, storage,
> >>> swapin and writeback policy. Different workloads could require
> >>> different policies. However, even though this patch is only a starting
> >>> point, it is still functionally correct by being equivalent to
> >>> zram-mTHP, and compatible with the rest of mm and swap as far as mTHP.
> >>> Another important functionality/data consistency decision I made in
> >>> this patch series is error handling during zswap_store() of mTHP: in
> >>> case of any errors, all swap offsets for the mTHP are deleted from the
> >>> zswap xarray/zpool, since we know that the mTHP will now have to be
> >>> stored in the backing swap device. IOW, an mTHP is either entirely
> >>> stored in zswap, or entirely not stored in zswap.
> >>>
> >>> To answer your question, we would need to come up with what the
> >>> semantics would need to be for zswap zpool storage granularity, swapin
> >>> granularity, readahead granularity and writeback wrt mTHP and how the
> >>> overall swap sub-system needs to "preserve" mTHP vs. splitting mTHP
> >>> into 4K/lower-order folios during swapout. Once we have a good
> >>> understanding of these policies, we could implement them in zswap.
> >>> Alternately, develop an abstraction that is one level above zswap/zram
> >>> and makes things easier and shareable between zswap and zram. By this,
> >>> I mean fundamental assumptions such as consecutive swap offsets (for
> >>> instance). To some extent, this implies that an mTHP as a swap entity
> >>> is defined by consecutiveness of swap offsets. Maybe the policy to
> >>> keep mTHPs in the system over extended duration might be to assemble
> >>> them dynamically based on swapin_readahead() decisions (which is based
> >>> on workload access patterns). In other words, mTHPs could be a useful
> >>> abstraction that can be static or even dynamic based on working set
> >>> characteristics, and cold memory preservation. This is quite a complex
> >>> topic imho.
> >>>
> >>> As we know, Barry Song and Chuanhua Han have started the discussion
> >>> on this in their zram mTHP swapin series [1].
> >>
> >> Yeah I'm a bit more concerned with the correctness aspect. As long as
> >> it's not buggy, then we can implement mTHP zswapout first, and force
> >> individual subpage (z)swapin for now (since we cannot control
> >> writeback from writing individual subpages).
> >
> > Absolutely, this sounds like the way to go!
> >
> >>
> >> We can discuss strategy to harmonize mTHP, zswap (with writeback) as
> >> we go along.
> >
> > Sounds great :)
> >
> >>
> >> BTW, I think we're not cc-ing Chengming? Is the get_maintainer.pl script
> >> not working properly... Let me manually add him in - please include
> >> him in future submissions and responses, as he is also a zswap reviewer
> >> :)
> >
> > I think when I ran get_maintainer.pl, I was in v6.10. For sure, will include
> > Chengming in future submissions and responses :)
> 
> Maybe a little late to the party, will take a look ASAP.
> It's interesting and great work.

Thanks! Appreciate your code review and suggestions to improve
the patchset.

Thanks,
Kanchana

> 
> Thanks!
> 
> >
> >>
> >> Also cc-ing Usama who is interested in this work.
> >
> > Sounds great.
> >
> > Thanks,
> > Kanchana
> >
> >>
> >>>
> >>> [1] https://lore.kernel.org/all/20240821074541.516249-3-hanchuanhua@...o.com/T/#u
> >>>
> >>> Thanks,
> >>> Kanchana
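
Editorial sketch (not part of the posted series): the thread describes
zswap_store() of an mTHP as all-or-nothing, where any error deletes every
swap offset of the folio, and it raises the concern that writeback of
individual subpages can later leave a range only partially in zswap. The
user-space toy model below illustrates both ideas under stated assumptions.
All names (toy_tree, toy_store_folio, toy_classify, TOY_NR_SLOTS) are
hypothetical and are not kernel APIs; a plain bool array stands in for the
zswap xarray, and a "failure" is simulated by a chosen offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SLOTS 64 /* size of the toy swap offset space (assumption) */

/* Toy stand-in for the zswap xarray: stored[i] is true if offset i has an entry. */
struct toy_tree {
	bool stored[TOY_NR_SLOTS];
};

enum toy_range_state { TOY_NONE_STORED, TOY_ALL_STORED, TOY_MIXED };

/* Hypothetical per-page store; fails when offset == fail_at (-1 means never). */
static bool toy_store_page(struct toy_tree *t, size_t offset, long fail_at)
{
	if (fail_at >= 0 && offset == (size_t)fail_at)
		return false; /* simulate a compression or allocation failure */
	t->stored[offset] = true;
	return true;
}

/* Delete every entry in [start, start + nr), whether or not it was stored. */
static void toy_delete_range(struct toy_tree *t, size_t start, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		t->stored[start + i] = false;
}

/* Store nr consecutive offsets; on any failure, roll the whole range back. */
bool toy_store_folio(struct toy_tree *t, size_t start, size_t nr, long fail_at)
{
	assert(nr > 0 && start + nr <= TOY_NR_SLOTS);
	for (size_t i = 0; i < nr; i++) {
		if (!toy_store_page(t, start + i, fail_at)) {
			/* The folio must now go to the backing device as a whole. */
			toy_delete_range(t, start, nr);
			return false;
		}
	}
	return true;
}

/* Report whether a range is fully stored, fully absent, or partially stored. */
enum toy_range_state toy_classify(const struct toy_tree *t, size_t start,
				  size_t nr)
{
	size_t hits = 0;

	assert(nr > 0 && start + nr <= TOY_NR_SLOTS);
	for (size_t i = 0; i < nr; i++)
		if (t->stored[start + i])
			hits++;
	if (hits == 0)
		return TOY_NONE_STORED;
	return hits == nr ? TOY_ALL_STORED : TOY_MIXED;
}
```

In this model, toy_store_folio() can only leave a range in the "all" or
"none" state. A "mixed" range appears only when something else removes
single entries afterwards, for example clearing one stored[] slot to mimic
writeback of one subpage. That is the situation in which, per the
discussion above, falling back to individual subpage swapin is the
conservative choice.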
