lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGsJ_4wuoWKSnzeJ-2Xoc=_du3ZL3Ms8s6K58w8En3_h8-q_ng@mail.gmail.com>
Date: Thu, 15 Aug 2024 11:24:49 +1200
From: Barry Song <21cnbao@...il.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>, 
	"hannes@...xchg.org" <hannes@...xchg.org>, "yosryahmed@...gle.com" <yosryahmed@...gle.com>, 
	"nphamcs@...il.com" <nphamcs@...il.com>, "ryan.roberts@....com" <ryan.roberts@....com>, 
	"Huang, Ying" <ying.huang@...el.com>, 
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "Zou, Nanhai" <nanhai.zou@...el.com>, 
	"Feghali, Wajdi K" <wajdi.k.feghali@...el.com>, "Gopal, Vinodh" <vinodh.gopal@...el.com>
Subject: Re: [RFC PATCH v1 2/4] mm: vmstat: Per mTHP-size zswap_store vmstat
 event counters.

On Thu, Aug 15, 2024 at 5:40 AM Sridhar, Kanchana P
<kanchana.p.sridhar@...el.com> wrote:
>
> Hi Barry,
>
> > -----Original Message-----
> > From: Barry Song <21cnbao@...il.com>
> > Sent: Wednesday, August 14, 2024 12:49 AM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> > Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> > hannes@...xchg.org; yosryahmed@...gle.com; nphamcs@...il.com;
> > ryan.roberts@....com; Huang, Ying <ying.huang@...el.com>; akpm@...ux-
> > foundation.org; Zou, Nanhai <nanhai.zou@...el.com>; Feghali, Wajdi K
> > <wajdi.k.feghali@...el.com>; Gopal, Vinodh <vinodh.gopal@...el.com>
> > Subject: Re: [RFC PATCH v1 2/4] mm: vmstat: Per mTHP-size zswap_store
> > vmstat event counters.
> >
> > On Wed, Aug 14, 2024 at 6:28 PM Kanchana P Sridhar
> > <kanchana.p.sridhar@...el.com> wrote:
> > >
> > > Added vmstat event counters per mTHP-size that can be used to account
> > > for folios of different sizes being successfully stored in ZSWAP.
> > >
> > > For this RFC, it is not clear if these zswpout counters should instead
> > > be added as part of the existing mTHP stats in
> > > /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats.
> > >
> > > The following is also a viable option, should it make better sense,
> > > for instance, as:
> > >
> > > /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/zswpout.
> > >
> > > If so, we would be able to distinguish between mTHP zswap and
> > > non-zswap swapouts through:
> > >
> > > /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/zswpout
> > >
> > > and
> > >
> > > /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/swpout
> > >
> > > respectively.
> > >
> > > Comments would be appreciated as to which approach is preferable.
> >
> > Even though swapout might go through zswap, from the perspective of
> > the mm core, it shouldn't be aware of that. Shouldn't zswpout be part
> > of swpout? Why are they separate? no matter if a mTHP has been
> > put in zswap, it has been swapped-out to mm-core? No?
>
> Thanks for the code review comments. This is a good point. I was keeping in
> mind the convention used by existing vmstat event counters that distinguish
> zswpout/zswpin from pswpout/pswpin events.
>
> If we want to keep the distinction in mTHP swapouts, would adding a
> separate MTHP_STAT_ZSWPOUT to "enum mthp_stat_item" be Ok?
>

I'm not entirely sure how important the zswpout counter is. To me, it doesn't
seem as critical as swpout and swpout_fallback, which are more useful for
system profiling. zswapout feels more like an internal detail related to
how the swap-out process is handled? If this is the case, we might not
need this per-size counter.

Otherwise, I believe sysfs is a better place to avoid all the chaos in vmstat
to handle various orders and sizes. So the question is, per-size zswpout
counter is really important or just for debugging purposes?

> In any case, it looks like all that would be needed is a call to
> count_mthp_stat(folio_order(folio), MTHP_STAT_[Z]SWPOUT) in the
> general case.
>
> I will make this change in v2, depending on whether or not the
> separation of zswpout vs. non-zswap swpout is recommended for
> mTHP.
>
> >
> >
> > >
> > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@...el.com>
> > > ---
> > >  include/linux/vm_event_item.h | 15 +++++++++++++++
> > >  mm/vmstat.c                   | 15 +++++++++++++++
> > >  2 files changed, 30 insertions(+)
> > >
> > > diff --git a/include/linux/vm_event_item.h
> > b/include/linux/vm_event_item.h
> > > index 747943bc8cc2..2451bcfcf05c 100644
> > > --- a/include/linux/vm_event_item.h
> > > +++ b/include/linux/vm_event_item.h
> > > @@ -114,6 +114,9 @@ enum vm_event_item { PGPGIN, PGPGOUT,
> > PSWPIN, PSWPOUT,
> > >                 THP_ZERO_PAGE_ALLOC,
> > >                 THP_ZERO_PAGE_ALLOC_FAILED,
> > >                 THP_SWPOUT,
> > > +#ifdef CONFIG_ZSWAP
> > > +               ZSWPOUT_PMD_THP_FOLIO,
> > > +#endif
> > >                 THP_SWPOUT_FALLBACK,
> > >  #endif
> > >  #ifdef CONFIG_MEMORY_BALLOON
> > > @@ -143,6 +146,18 @@ enum vm_event_item { PGPGIN, PGPGOUT,
> > PSWPIN, PSWPOUT,
> > >                 ZSWPIN,
> > >                 ZSWPOUT,
> > >                 ZSWPWB,
> > > +               ZSWPOUT_4KB_FOLIO,
> > > +#ifdef CONFIG_THP_SWAP
> > > +               mTHP_ZSWPOUT_8kB,
> > > +               mTHP_ZSWPOUT_16kB,
> > > +               mTHP_ZSWPOUT_32kB,
> > > +               mTHP_ZSWPOUT_64kB,
> > > +               mTHP_ZSWPOUT_128kB,
> > > +               mTHP_ZSWPOUT_256kB,
> > > +               mTHP_ZSWPOUT_512kB,
> > > +               mTHP_ZSWPOUT_1024kB,
> > > +               mTHP_ZSWPOUT_2048kB,
> > > +#endif
> >
> > This implementation hardcodes assumptions about the page size being 4KB,
> > but page sizes can vary, and so can the THP orders?
>
> Agreed, will address in v2.
>
> >
> > >  #endif
> > >  #ifdef CONFIG_X86
> > >                 DIRECT_MAP_LEVEL2_SPLIT,
> > > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > > index 8507c497218b..0e66c8b0c486 100644
> > > --- a/mm/vmstat.c
> > > +++ b/mm/vmstat.c
> > > @@ -1375,6 +1375,9 @@ const char * const vmstat_text[] = {
> > >         "thp_zero_page_alloc",
> > >         "thp_zero_page_alloc_failed",
> > >         "thp_swpout",
> > > +#ifdef CONFIG_ZSWAP
> > > +       "zswpout_pmd_thp_folio",
> > > +#endif
> > >         "thp_swpout_fallback",
> > >  #endif
> > >  #ifdef CONFIG_MEMORY_BALLOON
> > > @@ -1405,6 +1408,18 @@ const char * const vmstat_text[] = {
> > >         "zswpin",
> > >         "zswpout",
> > >         "zswpwb",
> > > +       "zswpout_4kb_folio",
> > > +#ifdef CONFIG_THP_SWAP
> > > +       "mthp_zswpout_8kb",
> > > +       "mthp_zswpout_16kb",
> > > +       "mthp_zswpout_32kb",
> > > +       "mthp_zswpout_64kb",
> > > +       "mthp_zswpout_128kb",
> > > +       "mthp_zswpout_256kb",
> > > +       "mthp_zswpout_512kb",
> > > +       "mthp_zswpout_1024kb",
> > > +       "mthp_zswpout_2048kb",
> > > +#endif
> >
> > The issue here is that the number of THP orders
> > can vary across different platforms.
>
> Agreed, will address in v2.
>
> Thanks,
> Kanchana
>
> >
> > >  #endif
> > >  #ifdef CONFIG_X86
> > >         "direct_map_level2_splits",
> > > --
> > > 2.27.0
> > >

Thanks
Barry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ