linux-kernel - Re: [patch 108/192] mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <YN7BSgvSJXzzbIBj@google.com>
Date:   Fri, 2 Jul 2021 00:33:30 -0700
From:   Minchan Kim <minchan@...nel.org>
To:     Zhaoyang Huang <huangzhaoyang@...il.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
        mm-commits@...r.kernel.org,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        torvalds@...ux-foundation.org,
        Zhaoyang Huang <zhaoyang.huang@...soc.com>
Subject: Re: [patch 108/192] mm: zram: amend SLAB_RECLAIM_ACCOUNT on
 zspage_cachep

On Fri, Jul 02, 2021 at 02:20:42PM +0800, Zhaoyang Huang wrote:
> On Fri, Jul 2, 2021 at 1:47 PM Minchan Kim <minchan@...nel.org> wrote:
> >
> > On Fri, Jul 02, 2021 at 10:45:09AM +0800, Zhaoyang Huang wrote:
> > > On Thu, Jul 1, 2021 at 10:56 PM Minchan Kim <minchan@...nel.org> wrote:
> > > >
> > > > On Wed, Jun 30, 2021 at 06:52:58PM -0700, Andrew Morton wrote:
> > > > > From: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> > > > > Subject: mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep
> > > > >
> > > > > Zspage_cachep is found be merged with other kmem cache during test, which
> > > > > is not good for debug things (zs_pool->zspage_cachep present to be another
> > > > > kmem cache in memory dumpfile).  It is also neccessary to do so as
> > > > > shrinker has been registered for zspage.
> > > > >
> > > > > Amending this flag can help kernel to calculate SLAB_RECLAIMBLE correctly.
> > > > >
> > > > > Link: https://lkml.kernel.org/r/1623137297-29685-1-git-send-email-huangzhaoyang@gmail.com
> > > > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> > > > > Cc: Minchan Kim <minchan@...nel.org>
> > > > > Cc: Sergey Senozhatsky <senozhatsky@...omium.org>
> > > > > Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> > > >
> > > > Sorry for the late. I don't think this is correct.
> > > >
> > > > It's true "struct zspage" can be freed by zsmalloc's compaction registerred
> > > > by slab shrinker so tempted to make it SLAB_RECLAIM_ACCOUNT. However, it's
> > > > quite limited to work only when objects in the zspage are heavily fragmented.
> > > > Once the compaction is done, zspage are never discardable until objects are
> > > > fragmented again. It means it could hurt other reclaimable slab page reclaiming
> > > > since the zspage slab object pins the page.
> > > IMHO, kmem cache's reclaiming is NOT affected by SLAB_RECLAIM_ACCOUNT
> > > . This flag just affects kmem cache merge[1], the slab page's migrate
> > > type[2] and the page's statistics. Actually, zspage's cache DO merged
> > > with others even without SLAB_RECLAIM_ACCOUNT currently, which maybe
> > > cause zspage's object will NEVER be discarded.(SLAB_MERGE_SAME
> > > introduce confusions as people believe the cache will merge with
> > > others when it set and vice versa)
> > >
> > > [1]
> > >  struct kmem_cache *find_mergeable(size_t size, size_t align, unsigned
> > > long flags, const char *name, void (*ctor)(void *))
> > > ...
> > >     if ((flags & SLAB_MERGE_SAME) != (s->flags & SLAB_MERGE_SAME))
> > >      continue;
> > >
> > > [2]
> > > if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > >     s->allocflags |= __GFP_RECLAIMABLE;
> >
> > That's the point here. With SLAB_RECLAIM_ACCOUNT, page allocator
> > try to allocate pages from MIGRATE_RECLAIMABLE with belief those
> > objects are easily reclaimable. Say a page has object A, B, C, D
> > and E. A-D are easily reclaimable but E is hard. What happens is
> > VM couldn't reclaim the page in the end due to E even though it
> > already reclaimed A-D. And the such fragmenation could be spread
> > out entire MIGRATE_RECLAIMABLE pageblocks over time.
> > That's why I'd like to put zspage into MIGRATE_UNMOVALBE from the
> > beginning since I don't think it's easily reclaimble once compaction
> > is done.
> The slab page could fallback to any migrate type even allocating with

It's true but it couldn't be justication to allocate objects from any
migration type. We should try to select right type. Please see below.

> __GFP_RECLAIMABLE, and there is only one page per slab within zspage's
> cache, which will not be affected by compaction, so I think that
> doesn't make sense.

You shouldn't rely on how many pages the slab has since it's internal
implemenation and zspage size also could be changed in the future.
And please think about external fragmentaion as well as internal one.

What we want to try with allocation type is to group similar lifetime
objects together in a pageblock group to help external fragmentation
for high-order allocation. Think what happens if the unreclaimable
object is located in a reclaimable pageblock. The block couldn't be
merged into high-order page in the end so it causes more compaction
and smaller available high-order pages in the system.