Message-ID: <20191121205631.GA487872@cmpxchg.org>
Date: Thu, 21 Nov 2019 15:56:31 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Hugh Dickins <hughd@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Shakeel Butt <shakeelb@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Alex Shi <alex.shi@...ux.alibaba.com>,
Roman Gushchin <guro@...com>, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel-team@...com
Subject: Re: [PATCH] mm: fix unsafe page -> lruvec lookups with cgroup charge migration

On Wed, Nov 20, 2019 at 07:15:27PM -0800, Hugh Dickins wrote:
> I like the way you've rearranged isolate_lru_page() there, but I
> don't think it amounts to more than a cleanup. Very good thinking
> about the odd "lruvec->pgdat = pgdat" case tucked away inside
> mem_cgroup_page_lruvec(), but actually, what harm does it do, if
> mem_cgroup_move_account() changes page->mem_cgroup concurrently?
>
> You say use-after-free, but we have spin_lock_irq here, and the
> struct mem_cgroup (and its lruvecs) cannot be freed until an RCU
> grace period expires, which we rely upon in many places, and which
> cannot happen until after the spin_unlock_irq.

You are correct, I missed the rcu locking implied by the
spinlock. With this, the justification for this patch is wrong.

But all of this is way too fragile and error-prone for my taste. We're
looking up a page's lruvec in a scope that does not promise at all
that the lruvec will be the page's. Luckily we currently don't touch
the lruvec outside of the PageLRU branch, but this subtlety is
entirely non-obvious from the code.

I will put more thought into this. Let's scrap this patch for now.
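
To illustrate the fragility described above, here is a rough userspace
sketch in C. It is a simplified single-threaded model, not kernel code:
the struct layouts and the names page_lruvec_sketch(),
move_account_sketch() and isolate_sketch() are hypothetical stand-ins,
locking is only indicated in comments, and the rule that charge
migration only happens to pages that are off the LRU is an assumption
of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model types; these are NOT the kernel's definitions. */
struct lruvec { int nr_pages; };
struct mem_cgroup { struct lruvec lruvec; };
struct page {
    struct mem_cgroup *mem_cgroup; /* may change on charge migration */
    bool on_lru;                   /* stands in for PageLRU() */
};

/* Hypothetical stand-in for the page -> lruvec lookup. */
static struct lruvec *page_lruvec_sketch(struct page *page)
{
    return &page->mem_cgroup->lruvec;
}

/*
 * Hypothetical stand-in for charge migration. Model assumption:
 * it is only ever applied to pages that are not on the LRU.
 */
static void move_account_sketch(struct page *page, struct mem_cgroup *to)
{
    assert(!page->on_lru);
    page->mem_cgroup = to;
}

/*
 * Lookup first, check second: the lruvec pointer is obtained before
 * the on_lru check, so it is only known to be the page's lruvec
 * inside the branch. Returns true if the page was isolated.
 */
static bool isolate_sketch(struct page *page)
{
    /* The real code would hold the LRU lock from here on. */
    struct lruvec *lruvec = page_lruvec_sketch(page);
    bool isolated = false;

    if (page->on_lru) {
        /* In this model the page cannot be migrated while on_lru. */
        page->on_lru = false;
        lruvec->nr_pages--;
        isolated = true;
    }
    /*
     * Using lruvec out here would be the non-obvious mistake: nothing
     * in this scope says it still belongs to the page.
     */
    return isolated;
}
```

Nothing in the function signals that the lruvec pointer is only
trustworthy inside the branch, which is the subtlety in question.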