Message-ID: <20141022180527.GA18998@phnom.home.cmpxchg.org>
Date: Wed, 22 Oct 2014 14:05:27 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Michal Hocko <mhocko@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch] mm: memcontrol: fix missed end-writeback accounting
On Wed, Oct 22, 2014 at 06:30:51PM +0200, Michal Hocko wrote:
> On Tue 21-10-14 14:19:10, Johannes Weiner wrote:
> > 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") changed page
> > migration to uncharge the old page right away. The page is locked,
> > unmapped, truncated, and off the LRU. But it could race with a
> > finishing writeback, which then doesn't get unaccounted properly:
> >
> >   test_clear_page_writeback()              migration
> >     acquire pc->mem_cgroup->move_lock
> >                                             wait_on_page_writeback()
> >     TestClearPageWriteback()
> >                                             mem_cgroup_migrate()
> >                                               clear PCG_USED
> >     if (PageCgroupUsed(pc))
> >       decrease memcg pages under writeback
> >     release pc->mem_cgroup->move_lock
> >
> > One solution for this would be to simply remove the PageCgroupUsed()
> > check, as RCU protects the memcg anyway.
> >
> > However, it's more robust to acknowledge that migration is really
> > modifying the charge state of alive pages in this case, and so it
> > should participate in the protocol specifically designed for this.
>
> It's been a long day so I might be missing something really obvious
> here. But how can move_lock help here when the fast path (no task
> migration is going on) takes only RCU read lock?

Argh, I actually noticed this issue while working on the page stat
simplification and thought I could break out a more isolated fix. But
you are right, that won't be enough, and I can't possibly put an RCU
grace period in mem_cgroup_migrate().

I also just realized that we can't remove the PageCgroupUsed() check
when updating the page stat, either, because the "fast path" start of
the transaction does not verify the memcg for us - we can't tell
whether it's gone stale before or during the transaction. Grrr.
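
To make the interleaving concrete, here is a minimal userspace sketch
that replays it step by step. It is not kernel code: every toy_* name
is hypothetical, the model is single-threaded and deterministic, and
"holding the lock" is reduced to a flag that makes the migration side
back off and retry later. It only illustrates why taking move_lock in
migration does not help when the writeback side went through the
lockless fast path.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the memcg and page_cgroup state. */
struct toy_memcg {
    long nr_writeback;      /* pages under writeback charged to the group */
    bool move_lock_held;    /* models pc->mem_cgroup->move_lock */
};

struct toy_page {
    struct toy_memcg *memcg;
    bool used;              /* models PCG_USED */
    bool writeback;         /* models PG_writeback */
};

/*
 * Migration side: wait for writeback, then uncharge the old page.
 * Returns false if it has to wait (writeback still set or lock held).
 */
static bool toy_migrate_uncharge(struct toy_page *page)
{
    if (page->writeback || page->memcg->move_lock_held)
        return false;
    page->used = false;     /* clear PCG_USED */
    return true;
}

/*
 * Writeback-completion side. Returns the memcg writeback counter after
 * the page has finished writeback; anything above 0 is a leaked count.
 */
long toy_end_writeback(bool writeback_takes_lock, bool migration_races)
{
    struct toy_memcg memcg = { .nr_writeback = 1, .move_lock_held = false };
    struct toy_page page = { .memcg = &memcg, .used = true, .writeback = true };
    bool migrated = false;

    /* Slow path takes the lock; the fast path relies on RCU alone. */
    if (writeback_takes_lock)
        memcg.move_lock_held = true;

    page.writeback = false;             /* TestClearPageWriteback() */

    /* Race window: migration sees writeback cleared and moves on. */
    if (migration_races)
        migrated = toy_migrate_uncharge(&page);

    if (page.used)                      /* PageCgroupUsed(pc) */
        memcg.nr_writeback--;

    memcg.move_lock_held = false;       /* end of the transaction */

    /* A migration that had to back off completes afterwards. */
    if (migration_races && !migrated)
        toy_migrate_uncharge(&page);

    return memcg.nr_writeback;
}
```

In this model, toy_end_writeback(false, true) leaves the counter at 1,
while toy_end_writeback(true, true) brings it back to 0. The lock only
excludes migration if the writeback side actually took it.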

Andrew, please scratch this patch and the next 4-part series that
reworks the page stat updates. I'll send a reduced version of it
that's marked for 3.17-stable.

Thanks