[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YJlfHt8vmKURDX5z@carbon.DHCP.thefacebook.com>
Date: Mon, 10 May 2021 09:28:14 -0700
From: Roman Gushchin <guro@...com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: Muchun Song <songmuchun@...edance.com>, <hannes@...xchg.org>,
<mhocko@...nel.org>, <shakeelb@...gle.com>,
<vdavydov.dev@...il.com>, <linux-kernel@...r.kernel.org>,
<linux-mm@...ck.org>, <duanxiongchun@...edance.com>,
<fam.zheng@...edance.com>, <zhengqi.arch@...edance.com>
Subject: Re: [PATCH v2] mm: memcontrol: fix root_mem_cgroup charging
On Sun, May 09, 2021 at 05:54:03PM -0700, Andrew Morton wrote:
> On Sun, 25 Apr 2021 15:54:10 +0800 Muchun Song <songmuchun@...edance.com> wrote:
>
> > The below scenario can cause the page counters of the root_mem_cgroup
> > to be out of balance.
> >
> > CPU0: CPU1:
> >
> > objcg = get_obj_cgroup_from_current()
> > obj_cgroup_charge_pages(objcg)
> > memcg_reparent_objcgs()
> > // reparent to root_mem_cgroup
> > WRITE_ONCE(iter->memcg, parent)
> > // memcg == root_mem_cgroup
> > memcg = get_mem_cgroup_from_objcg(objcg)
> > // do not charge to the root_mem_cgroup
> > try_charge(memcg)
> >
> > obj_cgroup_uncharge_pages(objcg)
> > memcg = get_mem_cgroup_from_objcg(objcg)
> > // uncharge from the root_mem_cgroup
> > refill_stock(memcg)
> > drain_stock(memcg)
> > page_counter_uncharge(&memcg->memory)
> >
> > get_obj_cgroup_from_current() never returns a root_mem_cgroup's objcg,
> > so we never explicitly charge the root_mem_cgroup. And it's not
> > going to change. It's all about a race when we got an obj_cgroup
> > pointing at some non-root memcg, but before we were able to charge it,
> > the cgroup was gone, objcg was reparented to the root and so we're
> > skipping the charging. Then we store the objcg pointer and later use
> > to uncharge the root_mem_cgroup.
> >
> > This can cause the page counter to be less than the actual value.
> > Although we do not display the value (mem_cgroup_usage) so there
> > shouldn't be any actual problem, but there is a WARN_ON_ONCE in
> > the page_counter_cancel(). Who knows if it will trigger? So it
> > is better to fix it.
> >
>
> It sounds like Roman will be acking this, but some additional reviewer
> attention would be helpful, please.
>
The patch is technically correct, so I'm ok to ack it:
Acked-by: Roman Gushchin <guro@...com>
I remember Michal was looking into it, so he can probably add something here.
Thanks!
Powered by blists - more mailing lists