Message-ID: <CAJD7tkZFWMx5GggPPMzcHJ4NiDPC0pbUgwUSQJVXgAYFozFY+Q@mail.gmail.com>
Date: Fri, 18 Aug 2023 15:26:04 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Yu Zhao <yuzhao@...gle.com>, Nhat Pham <nphamcs@...il.com>,
akpm@...ux-foundation.org, kernel-team@...a.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2] workingset: ensure memcg is valid for recency check
On Fri, Aug 18, 2023 at 3:19 PM Johannes Weiner <hannes@...xchg.org> wrote:
>
> On Fri, Aug 18, 2023 at 11:44:45AM -0700, Yosry Ahmed wrote:
> > On Fri, Aug 18, 2023 at 11:35 AM Johannes Weiner <hannes@...xchg.org> wrote:
> > >
> > > On Fri, Aug 18, 2023 at 10:45:56AM -0700, Yosry Ahmed wrote:
> > > > On Fri, Aug 18, 2023 at 10:35 AM Johannes Weiner <hannes@...xchg.org> wrote:
> > > > > On Fri, Aug 18, 2023 at 07:56:37AM -0700, Yosry Ahmed wrote:
> > > > > > If this happens it seems possible for this to happen:
> > > > > >
> > > > > > > cpu #1                      cpu#2
> > > > > > >                             css_put()
> > > > > > >                             /* css_free_rwork_fn is queued */
> > > > > > > rcu_read_lock()
> > > > > > > mem_cgroup_from_id()
> > > > > > >                             mem_cgroup_id_remove()
> > > > > > > /* access memcg */
> > > > >
> > > > > > I don't quite see how that'd be possible. IDR uses rcu_assign_pointer()
> > > > > during deletion, which inserts the necessary barriering. My
> > > > > understanding is that this should always be safe:
> > > > >
> > > > > > rcu_read_lock()          (writer serialization, in this case ref count == 0)
> > > > > > foo = idr_find(x)        idr_remove(x)
> > > > > > if (foo)                 kfree_rcu(foo)
> > > > > >   LOAD(foo->bar)
> > > > > > rcu_read_unlock()
> > > >
> > > > How does a barrier inside IDR removal protect against the memcg being
> > > > freed here though?
> > > >
> > > > If css_put() is executed out-of-order before mem_cgroup_id_remove(),
> > > > the memcg can be freed even before mem_cgroup_id_remove() is called,
> > > > right?
> > >
> > > css_put() can start earlier, but it's not allowed to reorder the rcu
> > > callback that frees past the rcu_assign_pointer() in idr_remove().
> > >
> > > > This is what RCU and its access primitives guarantee. It ensures that
> > > after "unpublishing" the pointer, all concurrent RCU-protected
> > > accesses to the object have finished, and the memory can be freed.
> >
> > I am not sure I understand, this is the scenario I mean:
> >
> > > cpu#1                  cpu#2                    cpu#3
> > > css_put()
> > > /* schedule free */
> > >                        rcu_read_lock()
> > > idr_remove()
> > >                        mem_cgroup_from_id()
> > >
> > >                                                 /* free memcg */
> > >                        /* use memcg */
>
> idr_remove() cannot be re-ordered after scheduling the free. Think
> about it, this is the common rcu-freeing pattern:
>
> rcu_assign_pointer(p, NULL);
> call_rcu(rh, free_pointee);
>
> on the write side, and:
>
> rcu_read_lock();
> pointee = rcu_dereference(p);
> if (pointee)
>         do_stuff(pointee);
> rcu_read_unlock();
>
> on the read side.
>
> In our case, the rcu_assign_pointer() is in idr_remove(). And the
> rcu_dereference() is in mem_cgroup_from_id() -> idr_find() ->
> radix_tree_lookup() -> radix_tree_descend().
>
> So if we find the memcg in the idr under rcu lock, the cgroup rcu work
> is guaranteed to not run until the lock is dropped. If we don't find
> it, it may or may not have already run.

Yeah, I missed the implicit barrier there. Thanks for bearing with me.

I think Shakeel might have found the actual problem here (see his
response).
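
For anyone following along, the ordering argument quoted above can be
illustrated with a toy, single-threaded userspace model. This is only a
sketch and not kernel code: every name in it (toy_obj, slot,
toy_read_lock(), toy_remove(), and so on) is hypothetical, the "free"
just sets a flag so it can be observed, and the grace period is
simplified to "no reader is currently inside a read-side section",
which is stricter than what real RCU waits for. It shows the two
outcomes described above: a reader that finds the object keeps it alive
until it drops the read lock, and a reader that arrives after the
removal simply finds nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the memcg; "freed" is a flag, not a real free. */
struct toy_obj {
	int value;
	int freed;
};

static struct toy_obj *slot;          /* stands in for the IDR entry */
static struct toy_obj *pending_free;  /* object with a queued free callback */
static int readers;                   /* readers inside a read-side section */

/* Run the queued free only once no reader is inside a read-side section. */
static void toy_run_callbacks(void)
{
	if (readers == 0 && pending_free != NULL) {
		pending_free->freed = 1;
		pending_free = NULL;
	}
}

static void toy_read_lock(void)
{
	readers++;
}

static void toy_read_unlock(void)
{
	assert(readers > 0);
	readers--;
	toy_run_callbacks();
}

/* Write side: unpublish first, then queue the free (never the reverse). */
static void toy_remove(void)
{
	struct toy_obj *old = slot;

	slot = NULL;          /* models rcu_assign_pointer(p, NULL) */
	pending_free = old;   /* models call_rcu(rh, free_pointee) */
	toy_run_callbacks();
}

/* Reader looks up before the removal: the free waits for the unlock. */
int toy_reader_found_case(void)
{
	struct toy_obj obj = { .value = 42, .freed = 0 };
	struct toy_obj *found;
	int alive_in_section;

	slot = &obj;
	toy_read_lock();
	found = slot;         /* models rcu_dereference(p) */
	toy_remove();         /* writer runs while the reader is active */
	alive_in_section = (found != NULL && !found->freed &&
			    found->value == 42);
	toy_read_unlock();    /* only now may the queued free run */

	return alive_in_section && obj.freed;
}

/* Reader looks up after the removal: it finds nothing to touch. */
int toy_reader_missed_case(void)
{
	struct toy_obj obj = { .value = 7, .freed = 0 };
	struct toy_obj *found;

	slot = &obj;
	toy_remove();         /* no readers, so the free runs right away */
	toy_read_lock();
	found = slot;
	toy_read_unlock();

	return found == NULL && obj.freed;
}
```

Both helper functions return 1 when the model behaves as described. The
point of the sketch is only the ordering on the write side: because the
pointer is unpublished before the free is queued, a reader either sees
the object and holds off the free, or does not see it at all.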