Message-ID: <20140925142312.GE11080@dhcp22.suse.cz>
Date: Thu, 25 Sep 2014 16:23:12 +0200
From: Michal Hocko <mhocko@...e.cz>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Tejun Heo <tj@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
Peter Zijlstra <peterz@...radead.org>, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch] mm: memcontrol: do not iterate uninitialized memcgs
On Thu 25-09-14 09:43:42, Johannes Weiner wrote:
[...]
> From 1cd659f42f399adc58522d478f54587c8c4dd5cc Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@...xchg.org>
> Date: Wed, 24 Sep 2014 22:00:20 -0400
> Subject: [patch] mm: memcontrol: do not iterate uninitialized memcgs
>
> The cgroup iterators yield css objects that have not yet gone through
> css_online(), but they are not complete memcgs at this point and so
> the memcg iterators should not return them. d8ad30559715 ("mm/memcg:
> iteration skip memcgs not yet fully initialized") set out to implement
> exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
> not meet the ordering requirements for memcg, and so the iterator may
> skip over initialized groups, or return partially initialized memcgs.
>
> The cgroup core can not reasonably provide a clear answer on whether
> the object around the css has been fully initialized, as that depends
> on controller-specific locking and lifetime rules. Thus, introduce a
> memcg-specific flag that is set after the memcg has been initialized
> in css_online(), and read before mem_cgroup_iter() callers access the
> memcg members.
>
> Signed-off-by: Johannes Weiner <hannes@...xchg.org>
> Cc: <stable@...r.kernel.org> [3.12+]
I am not an expert (obviously) on memory barriers, but from
Documentation/memory-barriers.txt my understanding is that
smp_load_acquire() and smp_store_release() are exactly what we need here.
"
However, after an ACQUIRE on a given variable, all memory accesses
preceding any prior RELEASE on that same variable are guaranteed to be
visible.
"
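To make the pairing concrete, here is a minimal userspace sketch using C11
atomics as a stand-in for the kernel primitives. Everything in it is
hypothetical and only for illustration: struct group, group_online(),
iter_next() and demo_first_visible_limit() are not kernel names, and
memory_order_release/memory_order_acquire are only an analogue of
smp_store_release()/smp_load_acquire(). The driver is single-threaded, so it
exercises the skip logic only; the ordering guarantee matters when the writer
and the reader run on different CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup. */
struct group {
	int limit;              /* ordinary member, set up while going online */
	atomic_int initialized; /* published last, with release semantics */
};

/* Writer side: finish all setup first, then publish the flag. */
static void group_online(struct group *g, int limit)
{
	g->limit = limit;
	/* Analogue of smp_store_release(&memcg->initialized, 1). */
	atomic_store_explicit(&g->initialized, 1, memory_order_release);
}

/*
 * Reader side: return the first published group at or after index
 * 'start', skipping groups whose flag is still clear, or NULL if
 * there is none.
 */
static struct group *iter_next(struct group *groups, size_t n, size_t start)
{
	for (size_t i = start; i < n; i++) {
		/* Analogue of smp_load_acquire(&memcg->initialized). */
		if (atomic_load_explicit(&groups[i].initialized,
					 memory_order_acquire))
			return &groups[i];
	}
	return NULL;
}

/* Driver: only groups[1] goes online, so the iterator must return it. */
static int demo_first_visible_limit(void)
{
	struct group groups[3];

	for (size_t i = 0; i < 3; i++) {
		groups[i].limit = 0;
		atomic_init(&groups[i].initialized, 0);
	}

	group_online(&groups[1], 42);

	struct group *g = iter_next(groups, 3, 0);
	assert(g == &groups[1]);
	return g ? g->limit : -1;
}
```

Because the flag is stored only after the members are written, a reader that
observes initialized == 1 through the acquire load is also guaranteed to
observe limit == 42.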
Acked-by: Michal Hocko <mhocko@...e.cz>
A stable backport would be trickier because ACQUIRE/RELEASE were
introduced later, but smp_mb() should be a safe replacement.
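As a rough illustration of the fallback, the same publish/observe pattern can
be written with full fences and plain (relaxed) flag accesses. This is again
a userspace C11 analogue, not the backport itself: the names are hypothetical
and atomic_thread_fence(memory_order_seq_cst) merely stands in for smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct mem_cgroup. */
struct group {
	int limit;
	atomic_int initialized;
};

/* Writer: set up members, full fence, then set the flag. */
static void group_online_mb(struct group *g, int limit)
{
	g->limit = limit;
	/* Stand-in for smp_mb(): orders the setup before the flag store. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&g->initialized, 1, memory_order_relaxed);
}

/* Reader: returns the limit, or -1 if the group is not published yet. */
static int group_read_limit_mb(struct group *g)
{
	if (!atomic_load_explicit(&g->initialized, memory_order_relaxed))
		return -1;
	/* Stand-in for smp_mb(): orders the flag load before member reads. */
	atomic_thread_fence(memory_order_seq_cst);
	return g->limit;
}

/* Driver: the group is invisible before going online, visible after. */
static int demo_mb(void)
{
	struct group g;

	g.limit = 0;
	atomic_init(&g.initialized, 0);

	int before = group_read_limit_mb(&g);
	group_online_mb(&g, 7);
	int after = group_read_limit_mb(&g);

	return before == -1 ? after : -2;
}
```

A full fence is stronger than strictly needed on both sides, which is why it
is a safe, if heavier, substitute for the acquire/release pair.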
Thanks!
> ---
> mm/memcontrol.c | 36 +++++++++++++++++++++++++++++++-----
> 1 file changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 306b6470784c..23976fd885fd 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -292,6 +292,9 @@ struct mem_cgroup {
> /* vmpressure notifications */
> struct vmpressure vmpressure;
>
> + /* css_online() has been completed */
> + int initialized;
> +
> /*
> * the counter to account for mem+swap usage.
> */
> @@ -1090,10 +1093,21 @@ skip_node:
> * skipping css reference should be safe.
> */
> if (next_css) {
> - if ((next_css == &root->css) ||
> - ((next_css->flags & CSS_ONLINE) &&
> - css_tryget_online(next_css)))
> - return mem_cgroup_from_css(next_css);
> + struct mem_cgroup *memcg = mem_cgroup_from_css(next_css);
> +
> + if (next_css == &root->css)
> + return memcg;
> +
> + if (css_tryget_online(next_css)) {
> + /*
> + * Make sure the memcg is initialized:
> + * mem_cgroup_css_online() orders the
> + * initialization against setting the flag.
> + */
> + if (smp_load_acquire(&memcg->initialized))
> + return memcg;
> + css_put(next_css);
> + }
>
> prev_css = next_css;
> goto skip_node;
> @@ -5413,6 +5427,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
> {
> struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> struct mem_cgroup *parent = mem_cgroup_from_css(css->parent);
> + int ret;
>
> if (css->id > MEM_CGROUP_ID_MAX)
> return -ENOSPC;
> @@ -5449,7 +5464,18 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
> }
> mutex_unlock(&memcg_create_mutex);
>
> - return memcg_init_kmem(memcg, &memory_cgrp_subsys);
> + ret = memcg_init_kmem(memcg, &memory_cgrp_subsys);
> + if (ret)
> + return ret;
> +
> + /*
> + * Make sure the memcg is initialized: mem_cgroup_iter()
> + * orders reading memcg->initialized against its callers
> + * reading the memcg members.
> + */
> + smp_store_release(&memcg->initialized, 1);
> +
> + return 0;
> }
>
> /*
> --
> 2.1.0
>
--
Michal Hocko
SUSE Labs