linux-kernel - Re: [PATCH v5 3/6] zswap: make shrinking memcg-aware

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJD7tkY+qdYytVKUjdgPypZthWA57gVKuEtjowuVPMpcOmpdLQ@mail.gmail.com>
Date:   Mon, 6 Nov 2023 12:57:44 -0800
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Nhat Pham <nphamcs@...il.com>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org,
        cerasuolodomenico@...il.com, sjenning@...hat.com,
        ddstreet@...e.org, vitaly.wool@...sulko.com, mhocko@...nel.org,
        roman.gushchin@...ux.dev, shakeelb@...gle.com,
        muchun.song@...ux.dev, chrisl@...nel.org, linux-mm@...ck.org,
        kernel-team@...a.com, linux-kernel@...r.kernel.org,
        cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kselftest@...r.kernel.org, shuah@...nel.org
Subject: Re: [PATCH v5 3/6] zswap: make shrinking memcg-aware

> >
> > This lock is only needed to synchronize updating pool->next_shrink,
> > right? Can we just use atomic operations instead? (e.g. cmpxchg()).
>
> I'm not entirely sure. I think in the pool destroy path, we have to also
> put the next_shrink memcg, so there's that.

We can use xchg() to replace it with NULL, then put the memcg ref, no?

We can also just hold zswap_pools_lock while shrinking the memcg
perhaps? It's not a contended lock anyway. It just feels weird to add
a spinlock to protect one pointer.

>
> >
> > > +               if (pool->next_shrink == memcg)
> > > +                       pool->next_shrink =
> > > +                               mem_cgroup_iter(NULL, pool->next_shrink, NULL, true);
> > > +               spin_unlock(&pool->next_shrink_lock);
> > > +       }
> > > +       spin_unlock(&zswap_pools_lock);
> > > +}
> > > +
> > >  /*********************************
> > >  * zswap entry functions
> > >  **********************************/
> > >  static struct kmem_cache *zswap_entry_cache;
> > >
> > > -static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
> > > +static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp, int nid)
> > >  {
> > >         struct zswap_entry *entry;
> > > -       entry = kmem_cache_alloc(zswap_entry_cache, gfp);
> > > +       entry = kmem_cache_alloc_node(zswap_entry_cache, gfp, nid);
> > >         if (!entry)
> > >                 return NULL;
> > >         entry->refcount = 1;
> > [..]
> > > @@ -1233,15 +1369,15 @@ bool zswap_store(struct folio *folio)
> > >                 zswap_invalidate_entry(tree, dupentry);
> > >         }
> > >         spin_unlock(&tree->lock);
> > > -
> > > -       /*
> > > -        * XXX: zswap reclaim does not work with cgroups yet. Without a
> > > -        * cgroup-aware entry LRU, we will push out entries system-wide based on
> > > -        * local cgroup limits.
> > > -        */
> > >         objcg = get_obj_cgroup_from_folio(folio);
> > > -       if (objcg && !obj_cgroup_may_zswap(objcg))
> > > -               goto reject;
> > > +       if (objcg && !obj_cgroup_may_zswap(objcg)) {
> > > +               memcg = get_mem_cgroup_from_objcg(objcg);
> > > +               if (shrink_memcg(memcg)) {
> > > +                       mem_cgroup_put(memcg);
> > > +                       goto reject;
> > > +               }
> > > +               mem_cgroup_put(memcg);
> >
> > Can we just use RCU here as well? (same around memcg_list_lru_alloc()
> > call below).
>
> For memcg_list_lru_alloc(): there's potentially sleeping in that piece of
> code I believe? I believe at the very least we'll have to use this gfp_t
> flag for it to be rcu-safe:
>
> GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN
> not sure the
>
> Same go for this particular place IIRC - there's some sleeping done
> in zswap_writeback_entry(), correct?

Ah right, I missed this. My bad.