[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201215020420.GJ3913616@dread.disaster.area>
Date: Tue, 15 Dec 2020 13:04:20 +1100
From: Dave Chinner <david@...morbit.com>
To: Yang Shi <shy828301@...il.com>
Cc: guro@...com, ktkhai@...tuozzo.com, shakeelb@...gle.com,
hannes@...xchg.org, mhocko@...e.com, akpm@...ux-foundation.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [v2 PATCH 3/9] mm: vmscan: guarantee shrinker_slab_memcg() sees
valid shrinker_maps for online memcg
On Mon, Dec 14, 2020 at 02:37:16PM -0800, Yang Shi wrote:
> The shrink_slab_memcg() races with mem_cgroup_css_online(). A visibility of CSS_ONLINE flag
> in shrink_slab_memcg()->mem_cgroup_online() does not guarantee that we will see
> memcg->nodeinfo[nid]->shrinker_maps != NULL. This may occur because of processor reordering
> on !x86.
>
> This seems like the below case:
>
> CPU A CPU B
> store shrinker_map load CSS_ONLINE
> store CSS_ONLINE load shrinker_map
>
> So the memory ordering could be guaranteed by smp_wmb()/smp_rmb() pair.
>
> The memory barriers pair will guarantee the ordering between shrinker_deferred and CSS_ONLINE
> for the following patches as well.
This should not require memory barriers in the shrinker code.
The code that sets and checks the CSS_ONLINE flag should have the
memory barriers to ensure that anything that sees an online CSS will
see it completely set up.
That is, the functions online_css() that set the CSS_ONLINE needs
a memory barrier to ensure all previous writes are completed before
the CSS_ONLINE flag is set, and the function mem_cgroup_online()
needs a barrier to pair with that.
This is the same existence issue that the superblock shrinkers have
with the shrinkers being registered before the superblock is fully
set up. The SB_BORN flag on the sueprblock indicates the superblock
is now fully set up ("online" in CSS speak) and the registered
shrinker can run. Please see the smp_wmb() before we set SB_BORN in
vfs_get_tree(), and the big comment about the smp_rmb() -after- we
check SB_BORN in super_cache_count() to understand the details of
the data dependency between the flag and the structures being set up
that the barriers enforce.
IOWs, these memory barriers belong inside the cgroup code to
guarantee anything that sees an online cgroup will always see the
fully initialised cgroup structures. They do not belong in the
shrinker infrastructure...
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com
Powered by blists - more mailing lists