lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20071114112531.GC9566@skynet.ie>
Date:	Wed, 14 Nov 2007 11:25:31 +0000
From:	mel@...net.ie (Mel Gorman)
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Alexey Dobriyan <adobriyan@...il.com>, akpm@...l.org,
	torvalds@...l.org, linux-kernel@...r.kernel.org,
	linux-mm@...r.kernel.org
Subject: Re: Major mke2fs slowdown (reproducable, bisected)

On (13/11/07 17:25), Andi Kleen didst pronounce:
> Alexey Dobriyan <adobriyan@...il.com> writes:
> >  
> > +/* Return the page with the lowest PFN in the list */
> > +static struct page *min_page(struct list_head *list)
> > +{
> > +	unsigned long min_pfn = -1UL;
> > +	struct page *min_page = NULL, *page;;
> > +
> > +	list_for_each_entry(page, list, lru) {
> > +		unsigned long pfn = page_to_pfn(page);
> > +		if (pfn < min_pfn) {
> > +			min_pfn = pfn;
> > +			min_page = page;
> > +		}
> > +	}
> > +
> > +	return min_page;
> > +}
> > +
> >  /* Remove an element from the buddy allocator from the fallback list */
> >  static struct page *__rmqueue_fallback(struct zone *zone, int order,
> >  						int start_migratetype)
> > @@ -795,8 +812,11 @@ retry:
> >  			if (list_empty(&area->free_list[migratetype]))
> >  				continue;
> >  
> > +			/* Bias kernel allocations towards low pfns */
> >  			page = list_entry(area->free_list[migratetype].next,
> >  					struct page, lru);
> > +			if (unlikely(start_migratetype != MIGRATE_MOVABLE))
> > +				page = min_page(&area->free_list[migratetype]);
> 
> Do I misread this, or does it really turn the O(1) buddy allocation into
> a "search whole free list" algorithm?  Even as fallback that looks like
> a quite extreme thing to do.
> 

It's extreme but not *quite* as extreme as you imply. The whole free-lists are
not searched, just one set at a specific order so it's "search a portion of
the free-lists". It happens for non-movable allocations (usually the minority)
and only then in fallback (in itself quite rare in almost all cases I've seen).

The problem was not detected before by me because it wasn't just a case of
creating a large number of pinned allocations but also depended on the type
of workload preceding it. If mke2fs was long-lived, it might not even have
been noticed. When run more than once, the fallbacks have all been dealt
with and it goes back to normal times.

The patch is now reverted and I don't expect to try bringing it back.
There are ways to bias the placement the pages as the patch intended without
doing an expensive search.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ