Message-ID: <CAJD7tkZV9NaPUzjZYQAx=D-vDEXZH46=Dn5zEuMHnsybT+HLAw@mail.gmail.com>
Date:   Tue, 31 Oct 2023 18:32:57 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Nhat Pham <nphamcs@...il.com>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org,
        cerasuolodomenico@...il.com, sjenning@...hat.com,
        ddstreet@...e.org, vitaly.wool@...sulko.com, mhocko@...nel.org,
        roman.gushchin@...ux.dev, shakeelb@...gle.com,
        muchun.song@...ux.dev, chrisl@...nel.org, linux-mm@...ck.org,
        kernel-team@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/5] zswap: make shrinking memcg-aware

On Tue, Oct 31, 2023 at 6:26 PM Nhat Pham <nphamcs@...il.com> wrote:
>
> cc-ing Johannes, Roman, Shakeel, Muchun since you all know much more
> about the memory controller + list_lru reparenting logic than I do.
>
> There seems to be a race between memcg offlining and zswap’s
> cgroup-aware LRU implementation:
>
> CPU0                            CPU1
> zswap_lru_add()                 mem_cgroup_css_offline()
>     get_mem_cgroup_from_objcg()
>                                     memcg_offline_kmem()
>                                         memcg_reparent_objcgs()
>                                         memcg_reparent_list_lrus()
>                                             memcg_reparent_list_lru()
>                                                 memcg_reparent_list_lru_node()
>     list_lru_add()
>                                                 memcg_list_lru_free()
>
>
> Essentially: on CPU0, zswap gets the memcg from the entry's objcg
> (before the objcgs are reparented). Then it performs list_lru_add()
> after the list_lru entries have been reparented (the
> memcg_reparent_list_lru_node() step). If the list_lru of the memcg
> being offlined has not yet been freed (i.e. before the
> memcg_list_lru_free() call), the list_lru_add() call will succeed,
> but the list will be freed soon after. As a result, the new zswap
> entry will not be subject to any future reclaim attempt. IOW, this
> list_lru_add() call is effectively swallowed. Worse, there might be a
> crash when we later invalidate the zswap_entry (which performs a
> list_lru removal).
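
To make the window concrete, here is a rough sketch of the add path. The
names and signatures below (entry->objcg, entry->lru, entry_to_nid(), the
explicit-memcg list_lru_add()) are approximations, not the exact code in
this series:

    /* Simplified sketch of the racy window described above. */
    static void zswap_lru_add(struct list_lru *list_lru,
                              struct zswap_entry *entry)
    {
        /* Pins the memcg so it cannot be freed, but not offlined. */
        struct mem_cgroup *memcg = get_mem_cgroup_from_objcg(entry->objcg);

        /*
         * mem_cgroup_css_offline() can run here on another CPU: the
         * objcgs are reparented, the memcg's list_lru entries are moved
         * to the parent, and the per-memcg list is then freed.
         */

        /* May land on a per-memcg list that is about to be freed. */
        list_lru_add(list_lru, &entry->lru, entry_to_nid(entry), memcg);

        mem_cgroup_put(memcg);
    }
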
>
> Within get_mem_cgroup_from_objcg(), none of the following seems
> sufficient to prevent this race:
>
>     1. Performing the objcg-to-memcg lookup inside an rcu_read_lock()
>     section.
>     2. Checking whether the memcg has been freed (with css_tryget()),
>     which is what we're currently doing in this patch series.
>     3. Checking whether the memcg is still online (with
>     css_tryget_online()). The memcg can still be offlined down the
>     line.
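
For reference, options 2 and 3 boil down to something like the below
(approximate, modeled on the current get_mem_cgroup_from_objcg()); the
tryget only guarantees the memcg is not freed, not that it stays online:

    struct mem_cgroup *memcg;

    rcu_read_lock();
retry:
    memcg = obj_cgroup_memcg(objcg);
    if (!css_tryget(&memcg->css))   /* or css_tryget_online() */
        goto retry;
    rcu_read_unlock();

    /*
     * Even with css_tryget_online(), mem_cgroup_css_offline() can still
     * run right here and reparent the list_lrus before the caller's
     * list_lru_add() executes.
     */
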
>
>
> I've discussed this privately with Johannes, and it seems like the
> cleanest solution here is to move the reparenting logic down to the
> release stage. That way, when get_mem_cgroup_from_objcg() returns,
> zswap_lru_add() is given a memcg that is reparenting-safe (until we
> drop the obtained reference).
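
If I'm reading the proposal right, it would look roughly like the
untested sketch below, i.e. calling memcg_reparent_list_lrus() from the
release callback instead of from the offline path (function names taken
from the current tree; whether the objcg side can move as well is exactly
what I'm unsure about):

    static void mem_cgroup_css_released(struct cgroup_subsys_state *css)
    {
        struct mem_cgroup *memcg = mem_cgroup_from_css(css);
        struct mem_cgroup *parent = parent_mem_cgroup(memcg);

        /* hypothetical: moved here from the offline path */
        if (parent)
            memcg_reparent_list_lrus(memcg, parent);

        invalidate_reclaim_iterators(memcg);
    }
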

The objcgs hold refs to the memcg, which are dropped during
reparenting. How can we do reparenting in the release stage, which
IIUC happens after all refs are dropped?
