linux-kernel - Re: [PATCH v5 3/6] zswap: make shrinking memcg-aware


Message-ID: <CAKEwX=Outf_hz_4UrzqKTbxxQD7y-Wm1cv9tOWC5J3V1ZmSiaA@mail.gmail.com>
Date:   Mon, 6 Nov 2023 15:25:05 -0800
From:   Nhat Pham <nphamcs@...il.com>
To:     Yosry Ahmed <yosryahmed@...gle.com>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org,
        cerasuolodomenico@...il.com, sjenning@...hat.com,
        ddstreet@...e.org, vitaly.wool@...sulko.com, mhocko@...nel.org,
        roman.gushchin@...ux.dev, shakeelb@...gle.com,
        muchun.song@...ux.dev, chrisl@...nel.org, linux-mm@...ck.org,
        kernel-team@...a.com, linux-kernel@...r.kernel.org,
        cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kselftest@...r.kernel.org, shuah@...nel.org
Subject: Re: [PATCH v5 3/6] zswap: make shrinking memcg-aware

On Mon, Nov 6, 2023 at 12:58 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
>
> > >
> > > This lock is only needed to synchronize updating pool->next_shrink,
> > > right? Can we just use atomic operations instead? (e.g. cmpxchg()).
> >
> > I'm not entirely sure. I think in the pool destroy path, we have to also
> > put the next_shrink memcg, so there's that.
>
> We can use xchg() to replace it with NULL, then put the memcg ref, no?
>
> We can also just hold zswap_pools_lock while shrinking the memcg
> perhaps? It's not a contended lock anyway. It just feels weird to add
> a spinlock to protect one pointer.

Ah, this sounds good to me. I'm not opposed to simplifying the
concurrency scheme this way.
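
For illustration only, here is a minimal userspace sketch of the
xchg()/cmpxchg() idea, written with C11 atomics rather than the kernel
primitives. Everything named fake_* is hypothetical and stands in for the
real pool and memcg types. It also assumes the cursor owns exactly one
reference on whatever it points at, and that the "next" memcg handed to
the advance helper already carries a reference that transfers to the
cursor on success. This is not the patch's code, just the shape of the
pointer handoff:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted memcg (not struct mem_cgroup). */
struct fake_memcg {
	atomic_int refs;
};

/* Hypothetical stand-in for the pool; only the shrink cursor is modeled. */
struct fake_pool {
	_Atomic(struct fake_memcg *) next_shrink;
};

/* Drop one reference; NULL is tolerated, like a put on an empty cursor. */
static void fake_memcg_put(struct fake_memcg *memcg)
{
	if (memcg)
		atomic_fetch_sub(&memcg->refs, 1);
}

/*
 * Destroy path: atomically detach the cursor (the xchg() suggestion),
 * then drop the reference the cursor was holding.
 */
static void fake_pool_release_cursor(struct fake_pool *pool)
{
	struct fake_memcg *old =
		atomic_exchange(&pool->next_shrink, (struct fake_memcg *)NULL);

	fake_memcg_put(old);
}

/*
 * Offline path: advance the cursor only if it still points at the dying
 * memcg (the cmpxchg() suggestion). Returns 1 if the cursor was swapped,
 * 0 if it pointed elsewhere. On failure the caller still owns its
 * reference on "next".
 */
static int fake_pool_advance_if(struct fake_pool *pool,
				struct fake_memcg *dying,
				struct fake_memcg *next)
{
	struct fake_memcg *expected = dying;

	if (!atomic_compare_exchange_strong(&pool->next_shrink, &expected, next))
		return 0;

	/* The cursor no longer references the dying memcg. */
	fake_memcg_put(dying);
	return 1;
}
```

The sketch does not model the iterator itself or the interaction with
zswap_pools_lock, so it says nothing about whether the atomic variant or
the "just hold zswap_pools_lock" variant is the better fit here.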

>
> >
> > >
> > > > +               if (pool->next_shrink == memcg)
> > > > +                       pool->next_shrink =
> > > > +                               mem_cgroup_iter(NULL, pool->next_shrink, NULL, true);
> > > > +               spin_unlock(&pool->next_shrink_lock);
> > > > +       }
> > > > +       spin_unlock(&zswap_pools_lock);
> > > > +}
> > > > +
> > > >  /*********************************
> > > >  * zswap entry functions
> > > >  **********************************/
> > > >  static struct kmem_cache *zswap_entry_cache;
> > > >
> > > > -static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
> > > > +static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp, int nid)
> > > >  {
> > > >         struct zswap_entry *entry;
> > > > -       entry = kmem_cache_alloc(zswap_entry_cache, gfp);
> > > > +       entry = kmem_cache_alloc_node(zswap_entry_cache, gfp, nid);
> > > >         if (!entry)
> > > >                 return NULL;
> > > >         entry->refcount = 1;
> > > [..]
> > > > @@ -1233,15 +1369,15 @@ bool zswap_store(struct folio *folio)
> > > >                 zswap_invalidate_entry(tree, dupentry);
> > > >         }
> > > >         spin_unlock(&tree->lock);
> > > > -
> > > > -       /*
> > > > -        * XXX: zswap reclaim does not work with cgroups yet. Without a
> > > > -        * cgroup-aware entry LRU, we will push out entries system-wide based on
> > > > -        * local cgroup limits.
> > > > -        */
> > > >         objcg = get_obj_cgroup_from_folio(folio);
> > > > -       if (objcg && !obj_cgroup_may_zswap(objcg))
> > > > -               goto reject;
> > > > +       if (objcg && !obj_cgroup_may_zswap(objcg)) {
> > > > +               memcg = get_mem_cgroup_from_objcg(objcg);
> > > > +               if (shrink_memcg(memcg)) {
> > > > +                       mem_cgroup_put(memcg);
> > > > +                       goto reject;
> > > > +               }
> > > > +               mem_cgroup_put(memcg);
> > >
> > > Can we just use RCU here as well? (same around memcg_list_lru_alloc()
> > > call below).
> >
> > For memcg_list_lru_alloc(): there's potentially sleeping in that piece of
> > code I believe? I believe at the very least we'll have to use this gfp_t
> > flag for it to be rcu-safe:
> >
> > GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN
> > not sure the
> >
> > Same goes for this particular place IIRC - there's some sleeping done
> > in zswap_writeback_entry(), correct?
>
> Ah right, I missed this. My bad.