Message-ID: <20101108154524.GA9530@localhost>
Date:	Mon, 8 Nov 2010 23:45:24 +0800
From:	Wu Fengguang <fengguang.wu@...el.com>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Minchan Kim <minchan.kim@...il.com>,
	Greg Thelen <gthelen@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Young <hidave.darkstar@...il.com>,
	Andrea Righi <arighi@...eler.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: memcg writeout throttling, was: [patch 4/4] memcg: use native
 word page statistics counters

On Mon, Nov 08, 2010 at 05:37:16PM +0800, Johannes Weiner wrote:
> On Mon, Nov 08, 2010 at 09:07:35AM +0900, Minchan Kim wrote:
> > BTW, let me ask a question.
> > dirty_writeback_pages seems to depend on mem_cgroup_page_stat's
> > result (i.e., negative) to separate global and memcg.
> > But mem_cgroup_page_stat could return a negative value because of
> > per-cpu counting as well as for the root cgroup.
> > If I understand right, isn't that a problem?
> 
> Yes, the numbers are not reliable and may be off by some.  It appears
> to me that the only sensible interpretation of a negative sum is to
> assume zero, though.  So to be honest, I don't understand the fallback
> to global state when the local state fluctuates around low values.

Agreed. It does not make sense to compare values from different domains.

The bdi stats use percpu_counter_sum_positive(), which never returns
negative values. It may be suitable for memcg page counts, too.
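
To illustrate the clamping, here is a minimal user-space sketch. It is
not the kernel's struct percpu_counter: the sketch_* names, the fixed
CPU count, and the plain array of per-cpu deltas are all assumptions
made for illustration, and locking is left out entirely.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Simplified stand-in for a per-cpu counter (assumption, not kernel code). */
struct sketch_counter {
	int64_t count;				/* shared base value */
	int32_t cpu_delta[NR_CPUS_SKETCH];	/* per-cpu deltas not yet folded in */
};

/* Raw sum: base value plus every per-cpu delta. Can be transiently negative. */
static int64_t sketch_counter_sum(const struct sketch_counter *c)
{
	int64_t sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += c->cpu_delta[cpu];
	return sum;
}

/* Same sum, but a negative result is reported as zero. */
static int64_t sketch_counter_sum_positive(const struct sketch_counter *c)
{
	int64_t sum = sketch_counter_sum(c);

	return sum < 0 ? 0 : sum;
}
```

With a base of 1 and per-cpu deltas of -3 and +1, the raw sum is -1
while the clamped variant reports 0, which is the "assume zero"
interpretation Johannes describes.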

> This function is also only used in throttle_vm_writeout(), where the
> outcome is compared to the global dirty threshold.  So using the
> number of writeback pages _from the current cgroup_ and falling back
> to global writeback pages when this number is low makes no sense to me
> at all.
> 
> It looks like it should rather compare the cgroup state with the cgroup
> limit, and the global state with the global limit.

Right.

> Can somebody explain the reasoning behind this?  And in case it makes
> sense after all, put a comment into this function?

It seems a better match to test sc->mem_cgroup rather than
mem_cgroup_from_task(current). The latter can mismatch: when someone
changes the memcg limits and thereby triggers memcg reclaim, the
current task is actually the (unrelated) shell. It's also possible
for a task in the memcg to trigger _global_ direct reclaim.
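
A rough user-space sketch of that idea follows. It is only an
illustration, not a patch: every sketch_* type and function is
hypothetical, a per-memcg dirty threshold is assumed to exist, and the
throttle test is reduced to a single comparison. The point is that the
domain is picked from the reclaim context and each domain is compared
only against its own limit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain numbers (global or one memcg). */
struct sketch_domain {
	long nr_writeback;	/* pages under writeback in this domain */
	long dirty_thresh;	/* dirty threshold of this domain */
};

/* Hypothetical stand-in for a memory cgroup. */
struct sketch_memcg {
	struct sketch_domain dom;
};

/* Stand-in for the reclaim context: mem_cgroup is NULL for global reclaim. */
struct sketch_scan_control {
	struct sketch_memcg *mem_cgroup;
};

/*
 * Choose the domain from the reclaim context, not from the current task,
 * then compare that domain's writeback count with that domain's limit.
 * A negative count is treated as zero instead of falling back to the
 * other domain.
 */
static bool sketch_should_throttle(const struct sketch_scan_control *sc,
				   const struct sketch_domain *global)
{
	const struct sketch_domain *dom =
		sc->mem_cgroup ? &sc->mem_cgroup->dom : global;
	long nr = dom->nr_writeback < 0 ? 0 : dom->nr_writeback;

	return nr > dom->dirty_thresh;
}
```

Here a memcg reclaim triggered from an unrelated shell still looks at
the memcg being reclaimed, and a global direct reclaim entered by a
memcg task still looks at the global numbers.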

Thanks,
Fengguang