Message-Id: <20090304084928.FD57.A69D9226@jp.fujitsu.com>
Date:	Wed,  4 Mar 2009 09:07:21 +0900 (JST)
From:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To:	balbir@...ux.vnet.ibm.com
Cc:	kosaki.motohiro@...fujitsu.com, linux-mm@...ck.org,
	Sudhir Kumar <skumar@...ux.vnet.ibm.com>,
	YAMAMOTO Takashi <yamamoto@...inux.co.jp>,
	Bharata B Rao <bharata@...ibm.com>,
	Paul Menage <menage@...gle.com>, lizf@...fujitsu.com,
	linux-kernel@...r.kernel.org, David Rientjes <rientjes@...gle.com>,
	Pavel Emelianov <xemul@...nvz.org>,
	Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [PATCH 4/4] Memory controller soft limit reclaim on contention (v3)

Hi Balbir

> > > > kswapd's role is to increase free pages up to zone->pages_high in its "own node".
> > > > mem_cgroup_soft_limit_reclaim() frees one (or more) excess pages in any node.
> > > > 
> > > > Oh, well.
> > > > I think that is inconsistent.
> > > > 
> > > > I would be glad if mem_cgroup_soft_limit_reclaim() were aware of the target node
> > > > and its pages_high.
> > > 
> > > Yes, correct. The role of kswapd is to keep increasing free pages until
> > > zone->pages_high, and the first pages to consider are those of memory
> > > cgroups over their soft limits. We pass the zonelist to ensure that
> > > while doing soft reclaim, we focus on the zonelist associated with the
> > > node. Kamezawa had concerns over calling the soft limit reclaim from
> > > __alloc_pages_internal(); did you prefer that call path?
> > 
> > I read your patch again.
> > It seems better to call mem_cgroup_soft_limit_reclaim() from within balance_pgdat().
> > 
> > Please imagine the worst-case scenario.
> > CPU0 (kswapd) keeps shrinking.
> > CPU1 runs another workload and charges the memcg continuously.
> > In that case balance_pgdat() does not exit for a very long time, so
> > mem_cgroup_soft_limit_reclaim() is never called.
> > 
> 
> Yes, true... that is why I added the hooks in __alloc_pages_internal()
> in the first two revisions, but Kamezawa objected to them. In the
> scenario that you mention that balance_pgdat() is busy, if we are
> under global system memory pressure, even after freeing memory from
> soft limited cgroups, we don't have sufficient free memory. We need to
> go reclaim from the whole system. An administrator can easily avoid
> the above scenario by using hard limits on the cgroup running on CPU1.

I agree that implementing the soft limit is difficult.

But I still don't like soft limit reclaim in __alloc_pages_internal().
If it is done there, kswapd reclaims pages from the global LRU *before*
any soft limit shrinking happens.

Again, the Linux reclaim policy is:

	free < pages_low:  run kswapd
	free < pages_min:  foreground reclaim via __alloc_pages_internal()

So if soft limit reclaim is put into __alloc_pages_internal(), it becomes:

	free < pages_low:  run kswapd
	free < pages_min:  soft limit reclaim and 
                           foreground reclaim via __alloc_pages_internal()

That seems like unintended behavior.
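
To make the ordering problem concrete, here is a rough model of the two
policies above as a small C function. This is only a sketch, not kernel code:
struct zone_model, the ACT_* flags, and reclaim_actions() are hypothetical
names made up for illustration, and the real allocator paths are far more
involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one zone's watermarks (not kernel code). */
struct zone_model {
	unsigned long free;       /* current number of free pages */
	unsigned long pages_min;  /* below this: foreground reclaim */
	unsigned long pages_low;  /* below this: kswapd is woken */
};

/* Hypothetical flags describing which reclaim paths would run. */
enum reclaim_action {
	ACT_NONE        = 0,
	ACT_WAKE_KSWAPD = 1 << 0,  /* background reclaim from the global LRU */
	ACT_SOFT_LIMIT  = 1 << 1,  /* soft limit reclaim from memcgs */
	ACT_DIRECT      = 1 << 2,  /* foreground reclaim in the allocation path */
};

/*
 * Return the set of reclaim paths triggered for the given free page count.
 * soft_limit_in_alloc_path == true models the variant where soft limit
 * reclaim is hooked into the allocation path (assumption for illustration).
 */
unsigned int reclaim_actions(const struct zone_model *z,
			     bool soft_limit_in_alloc_path)
{
	unsigned int actions = ACT_NONE;

	/* The model assumes the usual watermark ordering. */
	assert(z->pages_min <= z->pages_low);

	if (z->free < z->pages_low)
		actions |= ACT_WAKE_KSWAPD;

	if (z->free < z->pages_min) {
		if (soft_limit_in_alloc_path)
			actions |= ACT_SOFT_LIMIT;
		actions |= ACT_DIRECT;
	}

	return actions;
}
```

In this model, any free count between pages_min and pages_low sets only
ACT_WAKE_KSWAPD, whichever variant is chosen. So kswapd is already shrinking
the global LRU while the soft limit path has not run at all, which is the
ordering I am objecting to.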

In addition, I still strongly oppose the global lock, even if
soft limit shrinking is not put into __alloc_pages_internal().
I think that problem does not depend on where the call is placed.

