Message-ID: <20101108154524.GA9530@localhost>
Date: Mon, 8 Nov 2010 23:45:24 +0800
From: Wu Fengguang <fengguang.wu@...el.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Minchan Kim <minchan.kim@...il.com>,
Greg Thelen <gthelen@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Young <hidave.darkstar@...il.com>,
Andrea Righi <arighi@...eler.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: memcg writeout throttling, was: [patch 4/4] memcg: use native
word page statistics counters
On Mon, Nov 08, 2010 at 05:37:16PM +0800, Johannes Weiner wrote:
> On Mon, Nov 08, 2010 at 09:07:35AM +0900, Minchan Kim wrote:
> > BTW, let me ask a question.
> > dirty_writeback_pages seems to depend on mem_cgroup_page_stat's
> > result (i.e., negative) to separate global and memcg.
> > But mem_cgroup_page_stat could return a negative value because of
> > per-cpu counter drift as well as for the root cgroup.
> > If I understand right, isn't that a problem?
>
> Yes, the numbers are not reliable and may be off by some. It appears
> to me that the only sensible interpretation of a negative sum is to
> assume zero, though. So to be honest, I don't understand the fallback
> to global state when the local state fluctuates around low values.
Agreed. It does not make sense to compare values from different domains.
The bdi stats use percpu_counter_sum_positive(), which never returns
negative values. It may be suitable for memcg page counts, too.
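
To illustrate the clamping idea, here is a minimal userspace sketch. It
is not the kernel implementation: struct pcp_counter, pcp_sum() and
pcp_sum_positive() are hypothetical stand-ins that only mimic the
behaviour described above, where a sum over per-CPU deltas can come out
negative and is then treated as zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU counter: one signed delta per CPU. */
struct pcp_counter {
	const long *deltas;	/* per-CPU contributions, each may be negative */
	size_t nr_cpus;
};

/*
 * Raw sum over all CPUs. A page accounted on one CPU and unaccounted on
 * another can make a snapshot of the deltas sum to a negative value.
 */
static long pcp_sum(const struct pcp_counter *c)
{
	long sum = 0;

	assert(c != NULL);
	for (size_t i = 0; i < c->nr_cpus; i++)
		sum += c->deltas[i];
	return sum;
}

/* Same sum, but a negative result is interpreted as zero. */
static long pcp_sum_positive(const struct pcp_counter *c)
{
	long sum = pcp_sum(c);

	return sum < 0 ? 0 : sum;
}
```

With this shape a caller never has to give a negative page count a
special meaning, such as "fall back to the global state".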
> This function is also only used in throttle_vm_writeout(), where the
> outcome is compared to the global dirty threshold. So using the
> number of writeback pages _from the current cgroup_ and falling back
> to global writeback pages when this number is low makes no sense to me
> at all.
>
> It looks like it should rather compare the cgroup state with the cgroup
> limit, and the global state with the global limit.
Right.
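
A rough sketch of that like-with-like comparison, in userspace C. All
names here (struct throttle_domain, over_writeback_limit()) are
hypothetical and only model the idea. The real check in
throttle_vm_writeout() is more involved, and a per-cgroup dirty
threshold is assumed to exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: writeback page count and dirty threshold of one domain. */
struct throttle_domain {
	long nr_writeback;	/* may be transiently negative */
	long dirty_thresh;
};

/*
 * Compare a domain only against its own limit: the memcg against the
 * memcg threshold when reclaim targets a memcg, otherwise the global
 * state against the global threshold. There is no fallback from one
 * domain to the other when the memcg count is low or negative.
 */
static bool over_writeback_limit(const struct throttle_domain *memcg,
				 const struct throttle_domain *global)
{
	const struct throttle_domain *d;
	long nr;

	assert(global != NULL);
	d = memcg ? memcg : global;
	nr = d->nr_writeback < 0 ? 0 : d->nr_writeback;	/* negative reads as 0 */
	return nr > d->dirty_thresh;
}
```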
> Can somebody explain the reasoning behind this? And in case it makes
> sense after all, put a comment into this function?
It seems a better match to test sc->mem_cgroup rather than
mem_cgroup_from_task(current). The latter can pick the wrong cgroup.
When someone changes the memcg limits and thereby triggers memcg
reclaim, the current task is actually the (unrelated) shell. It's
also possible for a memcg task to trigger _global_ direct reclaim.
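
The two mismatch cases can be modelled in a few lines of userspace C.
This is only a sketch under assumptions: the *_sketch types and both
helpers are hypothetical, with scan_control_sketch.mem_cgroup standing
in for sc->mem_cgroup and NULL assumed to mean global reclaim.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named above. */
struct mem_cgroup_sketch { const char *name; };
struct task_sketch { struct mem_cgroup_sketch *memcg; };
struct scan_control_sketch {
	struct mem_cgroup_sketch *mem_cgroup;	/* NULL: global reclaim */
};

/* Reclaim target taken from the scan control: what is being reclaimed. */
static struct mem_cgroup_sketch *
target_from_sc(const struct scan_control_sketch *sc)
{
	assert(sc != NULL);
	return sc->mem_cgroup;
}

/* Reclaim target guessed from the running task: who happens to reclaim. */
static struct mem_cgroup_sketch *
target_from_task(const struct task_sketch *task)
{
	assert(task != NULL);
	return task->memcg;
}
```

If a shell in group A lowers the limit of group B, target_from_sc()
yields B while target_from_task() yields A. If a task in B enters
global direct reclaim, target_from_sc() yields NULL (global) while
target_from_task() still yields B.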
Thanks,
Fengguang