Message-ID: <20110119120319.GA2232@cmpxchg.org>
Date: Wed, 19 Jan 2011 13:03:19 +0100
From: Johannes Weiner <hannes@...xchg.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Balbir Singh <balbir@...ux.vnet.ibm.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: [patch rfc] memcg: correctly order reading PCG_USED and
pc->mem_cgroup
The read-side barrier is misplaced: the writer first sets
pc->mem_cgroup, then sets PCG_USED. The read-side barrier therefore
has to be between testing PCG_USED and reading pc->mem_cgroup, not
before both.
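For illustration, the intended pairing looks like this (a simplified
sketch of the commit and LRU paths, not the literal memcontrol.c
code):

	/* writer: charge commit */
	pc->mem_cgroup = mem;		/* publish the pointer first	*/
	smp_wmb();			/* order pointer before flag	*/
	SetPageCgroupUsed(pc);		/* then publish the flag	*/

	/* reader: LRU functions */
	if (!PageCgroupUsed(pc))	/* test the flag first		*/
		return;
	smp_rmb();			/* pairs with the smp_wmb()	*/
	mem = pc->mem_cgroup;		/* pointer read is now ordered	*/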
Signed-off-by: Johannes Weiner <hannes@...xchg.org>
---
mm/memcontrol.c | 27 +++++++++------------------
1 files changed, 9 insertions(+), 18 deletions(-)
I am a bit dumbfounded as to why this has never had any impact. I see
two scenarios where charging can race with LRU operations:
One is shmem pages on swapoff. They are already on the LRU when
charged as page cache, which could race with isolation/putback. This
seems rare enough to have gone unnoticed.
The other case is a swap cache page being charged while somebody else
has it isolated. mem_cgroup_lru_del_before_commit_swapcache() would
see the page isolated and skip it. The commit then races with
putback, which could see PCG_USED but not pc->mem_cgroup, and crash
with a NULL pointer dereference. This does sound a bit more likely.
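For reference, here is the interleaving I have in mind for that case,
with the current barrier placement (a hypothetical trace; putback
reads pc->mem_cgroup through page_cgroup_zoneinfo()):

	CPU 0 (commit)			CPU 1 (putback)
					smp_rmb();	/* too early */
	pc->mem_cgroup = mem;
	smp_wmb();
	SetPageCgroupUsed(pc);
					PageCgroupUsed(pc)
					/* sees the flag set */
					page_cgroup_zoneinfo(pc)
					/*
					 * Nothing orders this read against
					 * the flag test, so it may still
					 * observe pc->mem_cgroup == NULL
					 * and oops.
					 */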
Any idea? Am I missing something?
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5b562b3..db76ef7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -836,13 +836,12 @@ void mem_cgroup_rotate_lru_list(struct page *page, enum lru_list lru)
return;
pc = lookup_page_cgroup(page);
- /*
- * Used bit is set without atomic ops but after smp_wmb().
- * For making pc->mem_cgroup visible, insert smp_rmb() here.
- */
- smp_rmb();
/* unused or root page is not rotated. */
- if (!PageCgroupUsed(pc) || mem_cgroup_is_root(pc->mem_cgroup))
+ if (!PageCgroupUsed(pc))
+ return;
+ /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
+ smp_rmb();
+ if (mem_cgroup_is_root(pc->mem_cgroup))
return;
mz = page_cgroup_zoneinfo(pc);
list_move(&pc->lru, &mz->lists[lru]);
@@ -857,14 +856,10 @@ void mem_cgroup_add_lru_list(struct page *page, enum lru_list lru)
return;
pc = lookup_page_cgroup(page);
VM_BUG_ON(PageCgroupAcctLRU(pc));
- /*
- * Used bit is set without atomic ops but after smp_wmb().
- * For making pc->mem_cgroup visible, insert smp_rmb() here.
- */
- smp_rmb();
if (!PageCgroupUsed(pc))
return;
-
+ /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
+ smp_rmb();
mz = page_cgroup_zoneinfo(pc);
/* huge page split is done under lru_lock. so, we have no races. */
MEM_CGROUP_ZSTAT(mz, lru) += 1 << compound_order(page);
@@ -1031,14 +1026,10 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page)
return NULL;
pc = lookup_page_cgroup(page);
- /*
- * Used bit is set without atomic ops but after smp_wmb().
- * For making pc->mem_cgroup visible, insert smp_rmb() here.
- */
- smp_rmb();
if (!PageCgroupUsed(pc))
return NULL;
-
+ /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
+ smp_rmb();
mz = page_cgroup_zoneinfo(pc);
if (!mz)
return NULL;
--
1.7.3.4