Date:	Mon, 14 Dec 2009 09:23:19 -0500
From:	Larry Woodman <lwoodman@...hat.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Rik van Riel <riel@...hat.com>, kosaki.motohiro@...fujitsu.com,
	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, aarcange@...hat.com
Subject: Re: [PATCH] vmscan: limit concurrent reclaimers in shrink_zone

On Mon, 2009-12-14 at 14:08 +0100, Andi Kleen wrote:
> Rik van Riel <riel@...hat.com> writes:
> 
> > +max_zone_concurrent_reclaim:
> > +
> > +The number of processes that are allowed to simultaneously reclaim
> > +memory from a particular memory zone.
> > +
> > +With certain workloads, hundreds of processes end up in the page
> > +reclaim code simultaneously.  This can cause large slowdowns due
> > +to lock contention, freeing of far too much memory, and occasional
> > +false OOM kills.
> > +
> > +To avoid these problems, only allow a smaller number of processes
> > +to reclaim pages from each memory zone simultaneously.
> > +
> > +The default value is 8.
> 
> I don't like the hardcoded number. Is the same number good for a 128MB
> embedded system as for a 1TB server?  Seems doubtful.
> 
> This should be perhaps scaled with memory size and number of CPUs?

Remember, this is a per-zone number.
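
If scaling were wanted anyway, a hedged sketch (hypothetical helper,
not part of the posted patch) could derive the per-zone default from
the online CPU count instead of hardcoding 8:

	/*
	 * Hypothetical sketch, for illustration only: scale the
	 * per-zone reclaimer limit with the number of online CPUs,
	 * clamped to a sane range.
	 */
	static int default_zone_reclaimers(void)
	{
		int cpus = num_online_cpus();

		return clamp(cpus / 2, 4, 32);
	}

The clamp keeps a small embedded box from dropping below a handful of
reclaimers while capping the limit on large servers.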

> 
> > +/*
> > + * Maximum number of processes concurrently running the page
> > + * reclaim code in a memory zone.  Having too many processes
> > + * just results in them burning CPU time waiting for locks,
> > + * so we're better off limiting page reclaim to a sane number
> > + * of processes at a time.  We do this per zone so local node
> > + * reclaim on one NUMA node will not block other nodes from
> > + * making progress.
> > + */
> > +int max_zone_concurrent_reclaimers = 8;
> 
> __read_mostly
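
Concretely, Andi's annotation would be a one-line change:

	int max_zone_concurrent_reclaimers __read_mostly = 8;

__read_mostly groups the variable with other rarely-written data, so it
does not share cache lines with frequently-modified fields.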
> 
> > +
> >  static LIST_HEAD(shrinker_list);
> >  static DECLARE_RWSEM(shrinker_rwsem);
> >  
> > @@ -1600,6 +1612,29 @@ static void shrink_zone(int priority, struct zone *zone,
> >  	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
> >  	int noswap = 0;
> >  
> > +	if (!current_is_kswapd() && atomic_read(&zone->concurrent_reclaimers) >
> > +					max_zone_concurrent_reclaimers) {
> > +		/*
> > +		 * Do not add to the lock contention if this zone has
> > +		 * enough processes doing page reclaim already, since
> > +		 * we would just make things slower.
> > +		 */
> > +		sleep_on(&zone->reclaim_wait);
> 
> wait_event()? sleep_on() is a deprecated, racy interface.
> 
> This would still badly thunder the herd if not enough memory is
> freed, wouldn't it? It would be better to wake up only a single
> process when memory is freed.
> 
> How about waking up one thread for each page freed?
> 
> 
> -Andi
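
A hedged sketch of what Andi is suggesting (illustrative only, not the
actual patch): put each throttled reclaimer on the wait queue as an
exclusive waiter, so a wake_up() from the page-free path rouses exactly
one of them instead of the whole herd:

	/*
	 * Illustrative replacement for the racy sleep_on(): wait
	 * exclusively, so wake_up() wakes a single reclaimer.
	 */
	DEFINE_WAIT(wait);

	prepare_to_wait_exclusive(&zone->reclaim_wait, &wait,
				  TASK_UNINTERRUPTIBLE);
	if (atomic_read(&zone->concurrent_reclaimers) >
			max_zone_concurrent_reclaimers)
		schedule();
	finish_wait(&zone->reclaim_wait, &wait);

and, in the path that frees a page:

	wake_up(&zone->reclaim_wait);	/* wakes one exclusive waiter */

Because prepare_to_wait_exclusive() sets the task state before the
condition is re-checked, a wake-up arriving in between is not lost.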

