Message-Id: <20110914095504.30fca5d0.kamezawa.hiroyu@jp.fujitsu.com>
Date: Wed, 14 Sep 2011 09:55:04 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Johannes Weiner <jweiner@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Balbir Singh <bsingharora@...il.com>,
Ying Han <yinghan@...gle.com>, Michal Hocko <mhocko@...e.cz>,
Greg Thelen <gthelen@...gle.com>,
Michel Lespinasse <walken@...gle.com>,
Rik van Riel <riel@...hat.com>,
Minchan Kim <minchan.kim@...il.com>,
Christoph Hellwig <hch@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch 04/11] mm: memcg: per-priority per-zone hierarchy scan
generations
On Tue, 13 Sep 2011 13:03:01 +0200
Johannes Weiner <jweiner@...hat.com> wrote:
> On Tue, Sep 13, 2011 at 07:27:59PM +0900, KAMEZAWA Hiroyuki wrote:
> > On Mon, 12 Sep 2011 12:57:21 +0200
> > Johannes Weiner <jweiner@...hat.com> wrote:
> >
> > > Memory cgroup limit reclaim currently picks one memory cgroup out of
> > > the target hierarchy, remembers it as the last scanned child, and
> > > reclaims all zones in it with decreasing priority levels.
> > >
> > > Since the new hierarchy reclaim code will pick memory cgroups from the same
> > > hierarchy concurrently from different zones and priority levels, it
> > > becomes necessary that hierarchy roots not only remember the last
> > > scanned child, but do so for each zone and priority level.
> > >
> > > Furthermore, detecting full hierarchy round-trips reliably will become
> > > crucial, so instead of counting on one iterator site seeing a certain
> > > memory cgroup twice, use a generation counter that is increased every
> > > time the child with the highest ID has been visited.
> > >
> > > Signed-off-by: Johannes Weiner <jweiner@...hat.com>
> >
> > I cannot imagine how this works. Could you illustrate it with an easy example?
>
> Previously, we did
>
>   mem = mem_cgroup_iter(root)
>   for each priority level:
>     for each zone in zonelist:
>
> and this would reclaim memcg-1-zone-1, memcg-1-zone-2, memcg-1-zone-3
> etc.
>
yes.
> The new code does
>
>   for each priority level
>     for each zone in zonelist
>       mem = mem_cgroup_iter(root)
>
> but with a single last_scanned_child per memcg, this would scan
> memcg-1-zone-1, memcg-2-zone-2, memcg-3-zone-3 etc, which does not
> make much sense.
>
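The ordering problem described above can be sketched as a toy model in C. This is only an illustration under stated assumptions: `toy_root`, `toy_advance()`, `toy_scan_zones()` and the `NR_*` constants are hypothetical names, not the kernel's data structures, and memcgs are numbered from 0 here rather than from 1.

```c
#include <assert.h>

#define NR_MEMCGS 3 /* hypothetical number of children under the root */
#define NR_ZONES  3 /* hypothetical number of zones in the zonelist */
#define NR_PRIOS  2 /* hypothetical number of priority levels */

/* Toy model of a hierarchy root; field names are illustrative only. */
struct toy_root {
	int shared_cursor;              /* a single last_scanned_child */
	int cursor[NR_ZONES][NR_PRIOS]; /* one cursor per zone and priority */
};

/* Return the child the cursor points at, then advance it round-robin. */
static int toy_advance(int *cursor)
{
	int id = *cursor;

	*cursor = (*cursor + 1) % NR_MEMCGS;
	return id;
}

/*
 * One pass of the inner "for each zone in zonelist" loop at one priority.
 * picked[zone] receives the memcg index handed out for that zone.
 * per_zone == 0 uses the single shared cursor, otherwise the
 * per-zone, per-priority cursor.
 */
void toy_scan_zones(struct toy_root *root, int prio, int per_zone,
		    int picked[NR_ZONES])
{
	int zone;

	assert(prio >= 0 && prio < NR_PRIOS);
	for (zone = 0; zone < NR_ZONES; zone++)
		picked[zone] = toy_advance(per_zone ?
					   &root->cursor[zone][prio] :
					   &root->shared_cursor);
}
```

With the shared cursor, one pass over the zones hands out a different memcg for each zone (the memcg-1-zone-1, memcg-2-zone-2 pattern). With per-zone, per-priority cursors, every zone starts from the same memcg and advances independently.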
> Now imagine two reclaimers. With the old code, the first reclaimer
> would pick memcg-1 and scan all its zones, the second reclaimer would
> pick memcg-2 and reclaim all its zones. Without this patch, the first
> reclaimer would pick memcg-1 and scan zone-1, the second reclaimer
> would pick memcg-2 and scan zone-1, then the first reclaimer would
> pick memcg-3 and scan zone-2. If the reclaimers are concurrently
> scanning at different priority levels, things are even worse because
> one reclaimer may put much more force on the memcgs it gets from
> mem_cgroup_iter() than the other reclaimer. They must not share the
> same iterator.
>
> The generations are needed because the old algorithm did not rely too
> much on detecting full round-trips. After every reclaim cycle, it
> checked the limit and broke out of the loop if enough was reclaimed,
> no matter how many children were reclaimed from. The new algorithm is
> used for global reclaim, where the only exit condition of the
> hierarchy reclaim is the full roundtrip, because equal pressure needs
> to be applied to all zones.
>
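A generation counter that detects a full round-trip might look roughly like the following sketch. It assumes a fixed number of children with consecutive IDs. `toy_iter`, `toy_iter_next()` and `toy_reclaim_round()` are hypothetical names for illustration, not the patch's actual code. The only behaviour taken from the description is that the generation is increased once the child with the highest ID has been visited. Ending a walk on a stale generation is an assumption of this sketch.

```c
#include <assert.h>

#define NR_MEMCGS 3 /* hypothetical number of children under the root */

/* Toy iterator state, one instance per zone and priority level. */
struct toy_iter {
	int position;            /* next child ID to hand out */
	unsigned int generation; /* bumped after the highest-ID child */
};

/*
 * Hand out the next child for a reclaimer that started its walk in
 * generation my_generation. Returns -1 once the generation has moved
 * on, meaning the round-trip the reclaimer joined is complete.
 */
int toy_iter_next(struct toy_iter *it, unsigned int my_generation)
{
	int id;

	if (it->generation != my_generation)
		return -1;

	id = it->position++;
	if (it->position == NR_MEMCGS) {
		/* The child with the highest ID was just visited. */
		it->position = 0;
		it->generation++;
	}
	return id;
}

/* Walk until the round-trip completes; return how many children we got. */
int toy_reclaim_round(struct toy_iter *it)
{
	unsigned int gen = it->generation;
	int visited = 0;

	while (toy_iter_next(it, gen) >= 0)
		visited++;
	return visited;
}
```

A reclaimer that joins a walk another reclaimer has already begun only receives the remaining children of that generation, so in this model concurrent reclaimers on the same zone and priority split one round-trip between them instead of each seeing every child.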
Hm, OK, maybe this is good for global reclaim.
Is this used for both reclaim-by-limit and global reclaim?
If so, I will need to abandon the node-selection logic for reclaim-by-limit
and nodemask-for-memcg, which show very good results for me.
I'll be sad ;)
Thanks,
-Kame