[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20110120104947.GK2232@cmpxchg.org>
Date: Thu, 20 Jan 2011 11:49:47 +0100
From: Johannes Weiner <hannes@...xchg.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Balbir Singh <balbir@...ux.vnet.ibm.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch rfc] memcg: correctly order reading PCG_USED and
pc->mem_cgroup
On Thu, Jan 20, 2011 at 10:06:54AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 19 Jan 2011 13:03:19 +0100
> Johannes Weiner <hannes@...xchg.org> wrote:
>
> > The placement of the read-side barrier is confused: the writer first
> > sets pc->mem_cgroup, then PCG_USED. The read-side barrier has to be
> > between testing PCG_USED and reading pc->mem_cgroup.
> >
> > Signed-off-by: Johannes Weiner <hannes@...xchg.org>
> > ---
> > mm/memcontrol.c | 27 +++++++++------------------
> > 1 files changed, 9 insertions(+), 18 deletions(-)
> >
> > I am a bit dumbfounded as to why this has never had any impact. I see
> > two scenarios where charging can race with LRU operations:
> >
> > One is shmem pages on swapoff. They are on the LRU when charged as
> > page cache, which could race with isolation/putback. This seems
> > sufficiently rare.
> >
> > The other case is a swap cache page being charged while somebody else
> > had it isolated. mem_cgroup_lru_del_before_commit_swapcache() would
> > see the page isolated and skip it. The commit then has to race with
> > putback, which could see PCG_USED but not pc->mem_cgroup, and crash
> > with a NULL pointer dereference. This does sound a bit more likely.
> >
> > Any idea? Am I missing something?
> >
>
> I think troubles happen only when PCG_USED bit was found but pc->mem_cgroup
> is NULL. Hmm.
Correct. Well, or get linked to the wrong LRU list and subsequently
prevent removal of both cgroups because it's impossible to empty them.
> set pc->mem_cgroup
> write_barrier
> set USED bit.
>
> read_barrier
> check USED bit
> access pc->mem_cgroup
>
> So, is there a case which only USED bit can be seen ?
That's what I am not quite sure about. As said, I think it can happen
when swap cache charging races with reclaim or migration.
When the two loads of the used bit and the memcg pointer get
reordered, it could observe a set PCG_USED and a stale/unset
pc->mem_cgroup.
For example:
swap minor fault: vmscan:
lookup_swap_cache()
unlock_page()
lock_page()
mem_cgroup_try_charge_swapin()
putback_lru_page()
mem_cgroup_add_lru_list()
p = pc->mem_cgroup
pc->mem_cgroup = FOO
smp_wmb()
pc->flags |= PCG_USED
f = pc->flags
if (!(f & PCG_USED))
return
*p /* bang */
> Anyway, your patch is right.
>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Thank you.
Hannes
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists