Message-ID: <AANLkTim6ATcv_MOi0JJorH-wpTk1bUyyeAhbrUkyNimT@mail.gmail.com>
Date: Mon, 8 Nov 2010 11:00:56 -0800
From: Greg Thelen <gthelen@...gle.com>
To: Wu Fengguang <fengguang.wu@...el.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Minchan Kim <minchan.kim@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Young <hidave.darkstar@...il.com>,
Andrea Righi <arighi@...eler.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: memcg writeout throttling, was: [patch 4/4] memcg: use native
word page statistics counters

On Mon, Nov 8, 2010 at 7:45 AM, Wu Fengguang <fengguang.wu@...el.com> wrote:
> On Mon, Nov 08, 2010 at 05:37:16PM +0800, Johannes Weiner wrote:
>> On Mon, Nov 08, 2010 at 09:07:35AM +0900, Minchan Kim wrote:
>> > BTW, let me ask a question.
>> > dirty_writeback_pages seems to depend on mem_cgroup_page_stat()'s
>> > result (i.e., whether it is negative) to separate global and memcg.
>> > But mem_cgroup_page_stat() could return a negative value because of
>> > per-cpu counting as well as for the root cgroup.
>> > If I understand correctly, isn't that a problem?
>>
>> Yes, the numbers are not reliable and may be off by some. It appears
>> to me that the only sensible interpretation of a negative sum is to
>> assume zero, though. So to be honest, I don't understand the fallback
>> to global state when the local state fluctuates around low values.
>
> Agreed. It does not make sense to compare values from different domains.
>
> The bdi stats use percpu_counter_sum_positive(), which never returns
> negative values. It may be suitable for memcg page counts, too.
>
>> This function is also only used in throttle_vm_writeout(), where the
>> outcome is compared to the global dirty threshold. So using the
>> number of writeback pages _from the current cgroup_ and falling back
>> to global writeback pages when this number is low makes no sense to me
>> at all.
>>
>> It looks like it should rather compare the cgroup state with the cgroup
>> limit, and the global state with the global limit.
>
> Right.
>
>> Can somebody explain the reasoning behind this? And in case it makes
>> sense after all, put a comment into this function?
>
> It seems a better match to test sc->mem_cgroup rather than
> mem_cgroup_from_task(current). The latter could make mismatches. When
> someone is changing the memcg limits and hence triggers memcg
> reclaims, the current task is actually the (unrelated) shell. It's
> also possible for the memcg task to trigger _global_ direct reclaim.

Good point. I am writing a patch that will pass mem_cgroup from
sc->mem_cgroup into mem_cgroup_page_stat() rather than using
mem_cgroup_from_task(current). I will post this patch in a few hours.
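
To make the mismatch concrete, here is a rough user-space sketch of the
idea, not the patch itself. Everything in it is a hypothetical stand-in:
`struct memcg_model` and `struct scan_control_model` model the kernel
structures, `current_task_memcg` models what mem_cgroup_from_task(current)
would return, and treating a NULL `mem_cgroup` as "global reclaim, use the
global counter" is my assumption for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup: one counter only. */
struct memcg_model {
	long nr_writeback;
};

/* Hypothetical stand-in for struct scan_control.
 * Assumption: mem_cgroup == NULL means global reclaim. */
struct scan_control_model {
	struct memcg_model *mem_cgroup;
};

/* Models the cgroup of whichever task happens to be running. */
static struct memcg_model *current_task_memcg;

/* Models the global writeback page count. */
static long global_nr_writeback;

/* Old style: the stat is taken from the running task's cgroup. */
static long page_stat_of_current(void)
{
	assert(current_task_memcg != NULL);
	return current_task_memcg->nr_writeback;
}

/* New style: the caller names the cgroup it is asking about. */
static long page_stat(const struct memcg_model *memcg)
{
	assert(memcg != NULL);
	return memcg->nr_writeback;
}

/* What reclaim would see when it looks at the current task. */
static long writeback_seen_old(const struct scan_control_model *sc)
{
	(void)sc; /* the reclaim target is ignored, which is the bug */
	return page_stat_of_current();
}

/* What reclaim would see when it follows the reclaim target. */
static long writeback_seen_new(const struct scan_control_model *sc)
{
	if (sc->mem_cgroup == NULL)
		return global_nr_writeback; /* global reclaim: global state */
	return page_stat(sc->mem_cgroup);
}
```

With a shell in one cgroup lowering the limit of another, the old lookup
reports the shell's counter while the new one reports the counter of the
cgroup actually being reclaimed. A memcg task that enters global direct
reclaim gets the global counter rather than its own.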
I will also fix the negative value issue in mem_cgroup_page_stat().
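
For the negative value issue, one possible shape of the fix is to clamp
the summed per-cpu deltas at zero, in the spirit of
percpu_counter_sum_positive(). The sketch below is a user-space model
only: `struct percpu_stat_model`, `NR_CPUS_MODEL`, `stat_sum()` and
`stat_sum_positive()` are hypothetical names, and a fixed array stands in
for real per-cpu data.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4 /* hypothetical CPU count for the model */

/* Hypothetical stand-in for a per-cpu statistic: each CPU only records
 * its own increments and decrements, so single slots may be negative. */
struct percpu_stat_model {
	long delta[NR_CPUS_MODEL];
};

/* Raw sum: can go negative if a reader sees a decrement on one CPU
 * before the matching increment on another CPU. */
static long stat_sum(const struct percpu_stat_model *s)
{
	long sum = 0;
	size_t cpu;

	assert(s != NULL);
	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += s->delta[cpu];
	return sum;
}

/* Clamped sum: a negative total is reported as zero, since a page
 * count below zero has no sensible meaning. */
static long stat_sum_positive(const struct percpu_stat_model *s)
{
	long sum = stat_sum(s);

	return sum < 0 ? 0 : sum;
}
```

A snapshot such as { 0, -1, 0, 0 } sums to -1 but is reported as 0 by the
clamped variant, while a consistent snapshot such as { 2, -1, 0, 3 } is
reported unchanged as 4.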