[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140519140248.GD3017@dhcp22.suse.cz>
Date: Mon, 19 May 2014 16:02:48 +0200
From: Michal Hocko <mhocko@...e.cz>
To: Greg Thelen <gthelen@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Tejun Heo <tj@...nel.org>, Hugh Dickins <hughd@...gle.com>,
LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: [PATCH] memcg: deprecate memory.force_empty knob
On Fri 16-05-14 15:00:16, Greg Thelen wrote:
> On Tue, May 13 2014, Michal Hocko <mhocko@...e.cz> wrote:
[...]
> > If somebody really cares because reparented pages, which would be
> > dropped otherwise, push out more important ones then we should fix the
> > reparenting code and put pages to the tail.
>
> I should mention a case where I've needed to use memory.force_empty: to
> synchronously flush stats from child to parent. Without force_empty
> memory.stat is temporarily inconsistent until async css_offline
> reparents charges. Here is an example on v3.14 showing that
> parent/memory.stat contents are in-flux immediately after rmdir of
> parent/child.
OK, it is true that the delayed offlining makes this little bit
complicated because there is no direct user visible relation between
rmdir and css_offline.
> $ cat /test
> #!/bin/bash
>
> # Create parent and child. Add some non-reclaimable anon rss to child,
> # then move running task to parent.
> mkdir p p/c
> (echo $BASHPID > p/c/cgroup.procs && exec sleep 1d) &
> pid=$!
> sleep 1
> echo $pid > p/cgroup.procs
>
> grep 'rss ' {p,p/c}/memory.stat
> if [[ $1 == force ]]; then
> echo 1 > p/c/memory.force_empty
> fi
> rmdir p/c
>
> echo 'For a small time the p/c memory has not been reparented to p.'
> grep 'rss ' {p,p/c}/memory.stat
>
> sleep 1
> echo 'After waiting all memory has been reparented'
> grep 'rss ' {p,p/c}/memory.stat
>
> kill $pid
> rmdir p
>
>
> -- First, demonstrate that just rmdir, without memory.force_empty,
> temporarily hides reparented child memory stats.
>
> $ /test
> p/memory.stat:rss 0
> p/memory.stat:total_rss 69632
> p/c/memory.stat:rss 69632
> p/c/memory.stat:total_rss 69632
> For a small time the p/c memory has not been reparented to p.
> p/memory.stat:rss 0
> p/memory.stat:total_rss 0
OK, this is a bug. Our iterators skip the children because css_tryget
fails on it but css_offline still not done. This is fixable, though,
and force_empty is just a workaround so I wouldn't see this as a proper
justification to keep it alive.
One possible way to fix this is to iterate children even when css_tryget
fails for them if they haven't finished css_offline yet.
There are some changes in the cgroups core which should make this easier
and Johannes claimed he has some work in that area.
Anyway this is a useful testcase. Thanks Greg!
> grep: p/c/memory.stat: No such file or directory
> After waiting all memory has been reparented
> p/memory.stat:rss 69632
> p/memory.stat:total_rss 69632
> grep: p/c/memory.stat: No such file or directory
> /test: Terminated ( echo $BASHPID > p/c/cgroup.procs && exec sleep 1d )
>
> -- Demonstrate that using memory.force_empty before rmdir, behaves more
> sensibly. Stats for reparented child memory are not hidden.
>
> $ /test force
> p/memory.stat:rss 0
> p/memory.stat:total_rss 69632
> p/c/memory.stat:rss 69632
> p/c/memory.stat:total_rss 69632
> For a small time the p/c memory has not been reparented to p.
> p/memory.stat:rss 69632
> p/memory.stat:total_rss 69632
> grep: p/c/memory.stat: No such file or directory
> After waiting all memory has been reparented
> p/memory.stat:rss 69632
> p/memory.stat:total_rss 69632
> grep: p/c/memory.stat: No such file or directory
> /test: Terminated ( echo $BASHPID > p/c/cgroup.procs && exec sleep 1d )
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists