Date:	Wed, 15 Jun 2011 15:48:25 -0700
From:	Ying Han <yinghan@...gle.com>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Minchan Kim <minchan.kim@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Mel Gorman <mgorman@...e.de>, Greg Thelen <gthelen@...gle.com>,
	Michel Lespinasse <walken@...gle.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [patch 4/8] memcg: rework soft limit reclaim

On Thu, Jun 9, 2011 at 8:00 AM, Michal Hocko <mhocko@...e.cz> wrote:
> On Thu 02-06-11 22:25:29, Ying Han wrote:
>> On Thu, Jun 2, 2011 at 2:55 PM, Ying Han <yinghan@...gle.com> wrote:
>> > On Tue, May 31, 2011 at 11:25 PM, Johannes Weiner <hannes@...xchg.org> wrote:
>> >> Currently, soft limit reclaim is entered from kswapd, where it selects
> [...]
>> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> >> index c7d4b44..0163840 100644
>> >> --- a/mm/vmscan.c
>> >> +++ b/mm/vmscan.c
>> >> @@ -1988,9 +1988,13 @@ static void shrink_zone(int priority, struct zone *zone,
>> >>                unsigned long reclaimed = sc->nr_reclaimed;
>> >>                unsigned long scanned = sc->nr_scanned;
>> >>                unsigned long nr_reclaimed;
>> >> +               int epriority = priority;
>> >> +
>> >> +               if (mem_cgroup_soft_limit_exceeded(root, mem))
>> >> +                       epriority -= 1;
>> >
>> > Here we grant the ability to shrink from all the memcgs, but only
>> > raise the priority for those that exceed the soft_limit. That is a
>> > design change for the "soft_limit", which gives a hint as to which
>> > memcgs to reclaim from first under global memory pressure.
>>
>>
>> Basically, we shouldn't reclaim from a memcg under its soft_limit
>> unless we have trouble reclaiming pages from others.
>
> Agreed.
>
>> Something like the following makes better sense:
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index bdc2fd3..b82ba8c 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1989,6 +1989,8 @@ restart:
>>         throttle_vm_writeout(sc->gfp_mask);
>>  }
>>
>> +#define MEMCG_SOFTLIMIT_RECLAIM_PRIORITY       2
>> +
>>  static void shrink_zone(int priority, struct zone *zone,
>>                                 struct scan_control *sc)
>>  {
>> @@ -2001,13 +2003,13 @@ static void shrink_zone(int priority, struct zone *zone,
>>                 unsigned long reclaimed = sc->nr_reclaimed;
>>                 unsigned long scanned = sc->nr_scanned;
>>                 unsigned long nr_reclaimed;
>> -               int epriority = priority;
>>
>> -               if (mem_cgroup_soft_limit_exceeded(root, mem))
>> -                       epriority -= 1;
>> +               if (!mem_cgroup_soft_limit_exceeded(root, mem) &&
>> +                               priority > MEMCG_SOFTLIMIT_RECLAIM_PRIORITY)
>> +                       continue;
>
> yes, this makes sense but I am not sure about the right(tm) value of the
> MEMCG_SOFTLIMIT_RECLAIM_PRIORITY. 2 sounds too low. You would do quite a
> lot of loops
> (DEFAULT_PRIORITY-MEMCG_SOFTLIMIT_RECLAIM_PRIORITY) * zones * memcg_count
> without any progress (assuming that all of them are under soft limit
> which doesn't sound like a totally artificial configuration) until you
> allow reclaiming from groups that are under soft limit. Then, when you
> finally get to reclaiming, you scan rather aggressively.

Fair enough, something smarter is definitely needed :)

>
> Maybe something like 3/4 of DEFAULT_PRIORITY? You would get 3 times
> over all (unbalanced) zones and all cgroups that are above the limit
> (scanning max{1/4096+1/2048+1/1024, 3*SWAP_CLUSTER_MAX} of the LRUs for
> each cgroup) which could be enough to collect the low hanging fruit.

Hmm, that sounds more reasonable than the initial proposal.
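
To put rough numbers on the two thresholds, here is a small userspace
sketch (not kernel code). It assumes DEF_PRIORITY is 12 (the thread
calls it DEFAULT_PRIORITY) and SWAP_CLUSTER_MAX is 32, and that a pass
at priority p scans about lru_size >> p pages. The helper names
skipped_passes() and pages_scanned() are made up for illustration.

```c
/* Assumed values, not taken from the patch under discussion. */
#define DEF_PRIORITY      12
#define SWAP_CLUSTER_MAX  32UL

/*
 * Hypothetical helper: number of priority levels, counting down from
 * DEF_PRIORITY, at which memcgs under their soft limit are skipped,
 * i.e. the levels where priority > threshold.
 */
int skipped_passes(int threshold)
{
	int passes = 0;

	for (int prio = DEF_PRIORITY; prio > threshold; prio--)
		passes++;
	return passes;
}

/*
 * Hypothetical helper: rough number of pages scanned from one
 * over-limit memcg during those passes, following the estimate
 * max{sum of (lru_size >> prio), passes * SWAP_CLUSTER_MAX}.
 */
unsigned long pages_scanned(unsigned long lru_size, int threshold)
{
	unsigned long sum = 0;
	unsigned long floor = 0;

	for (int prio = DEF_PRIORITY; prio > threshold; prio--) {
		sum += lru_size >> prio;
		floor += SWAP_CLUSTER_MAX;
	}
	return sum > floor ? sum : floor;
}
```

With a threshold of 2, under-limit memcgs are skipped for 10 priority
levels; with 3/4 of DEF_PRIORITY (9) it is only 3, and an over-limit
memcg with a 1M-page LRU would see about 256 + 512 + 1024 pages scanned
in that window.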

For the same worst case, where all the memcgs are below their soft
limit, we would need to walk all the memcgs three times before actually
reclaiming anything. For that condition, I cannot think of anything
that solves the problem completely unless we keep a separate per-zone
list of memcgs (like what we do currently).
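
The worst case can be modelled with a few lines of userspace C. This is
only a sketch: struct memcg_model and wasted_visits() are hypothetical
stand-ins, DEF_PRIORITY is assumed to be 12, and the loop ignores
everything shrink_zone() really does except the skip check from the
proposed patch.

```c
#include <stdbool.h>

#define DEF_PRIORITY 12	/* assumed value */

/* Hypothetical stand-in for a memcg; only tracks soft limit state. */
struct memcg_model {
	bool over_soft_limit;
};

/* Mirrors the check in the proposed patch: skip groups under their
 * soft limit until priority has dropped to the threshold. */
bool skip_memcg(const struct memcg_model *mem, int priority, int threshold)
{
	return !mem->over_soft_limit && priority > threshold;
}

/* Count the (priority, zone, memcg) visits that end in "continue". */
unsigned long wasted_visits(const struct memcg_model *memcgs, int nr_memcgs,
			    int nr_zones, int threshold)
{
	unsigned long wasted = 0;

	for (int prio = DEF_PRIORITY; prio >= 0; prio--)
		for (int zone = 0; zone < nr_zones; zone++)
			for (int i = 0; i < nr_memcgs; i++)
				if (skip_memcg(&memcgs[i], prio, threshold))
					wasted++;
	return wasted;
}
```

With 100 memcgs all under their soft limit and 4 zones, this counts
4000 skipped visits for a threshold of 2 and 1200 for a threshold of 9,
which matches (DEFAULT_PRIORITY - threshold) * zones * memcg_count.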

--Ying

> --
> Michal Hocko
> SUSE Labs
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9
> Czech Republic
>