linux-kernel - Re: [PATCH 4/7] Memory compaction core

Message-ID: <20100106220725.GD5426@csn.ul.ie>
Date:	Wed, 6 Jan 2010 22:07:25 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Andrea Arcangeli <aarcange@...hat.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Adam Litke <agl@...ibm.com>, Avi Kivity <avi@...hat.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 4/7] Memory compaction core

On Wed, Jan 06, 2010 at 10:37:22PM +0100, Andi Kleen wrote:
> Mel Gorman <mel@....ul.ie> writes:
> 
> 
> Haven't reviewed the full thing, but one thing I noticed below:
> 
> > +
> > +	/*
> > +	 * Isolate free pages until enough are available to migrate the
> > +	 * pages on cc->migratepages. We stop searching if the migrate
> > +	 * and free page scanners meet or enough free pages are isolated.
> > +	 */
> > +	spin_lock_irq(&zone->lock);
> 
> Won't that cause very long lock hold times on large zones?

Good question.  The amount of memory unavailable and the duration should
be bounded.

isolate_migratepages only considers one pageblock of pages at a time, which
is at most MAX_ORDER_NR_PAGES pages, so ordinarily you would expect the hold
time to be fairly short, even on large zones.
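
As a rough illustration of that bound, here is a minimal userspace sketch.
It is not kernel code. The SKETCH_* constants and both function names are
hypothetical, and the values (MAX_ORDER of 11, 4 KiB pages) are assumed
defaults rather than anything stated in this thread.

```c
#include <assert.h>
#include <limits.h>

/* Assumed values, not stated in this thread: MAX_ORDER of 11 and
 * 4 KiB pages. */
#define SKETCH_MAX_ORDER          11
#define SKETCH_MAX_ORDER_NR_PAGES (1UL << (SKETCH_MAX_ORDER - 1))
#define SKETCH_PAGE_SIZE          4096UL

/* Worst-case number of pages a single pageblock scan can consider.
 * The cap does not depend on how large the zone is. */
unsigned long sketch_scan_bound(unsigned long zone_pages)
{
	if (zone_pages < SKETCH_MAX_ORDER_NR_PAGES)
		return zone_pages;
	return SKETCH_MAX_ORDER_NR_PAGES;
}

/* The same bound expressed in bytes. */
unsigned long sketch_scan_bound_bytes(unsigned long zone_pages)
{
	unsigned long pages = sketch_scan_bound(zone_pages);

	/* Guard the multiplication against overflow. */
	assert(pages <= ULONG_MAX / SKETCH_PAGE_SIZE);
	return pages * SKETCH_PAGE_SIZE;
}
```

Under these assumptions a zone of any size is scanned at most 1024 pages
(4 MiB) per pass, which is why the zone size alone should not stretch the
hold time.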

The one exception is if migration of too many of these pages is failing.
Pages that fail to migrate are not immediately put back on the LRU list, so
in a really bad scenario too many free pages could indeed get isolated. I
comment on this problem, although from another perspective, here:

         * XXX: Page migration at this point tries fairly hard to move
         *      pages as it is but if migration fails, pages are left
         *      on cc->migratepages for more passes. This might cause
         *      multiple useless failures. Watch
         *      compact_pagemigrate_failed
         *      in /proc/vmstat. If it grows a lot, then putback should
         *      happen after each failed migration

So, in theory, in a worst-case scenario cc->migratepages could grow too
much. The solution would be to put pages that fail to migrate back on the
LRU list. That would keep the length of time zone->lock is held low.
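
To make the difference concrete, a small userspace model of one migration
pass follows. It is only a sketch. struct sketch_page and
sketch_migrate_pass() are hypothetical names, and whether a page migrates
is a fixed flag here instead of the outcome of real page migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page isolated onto cc->migratepages. */
struct sketch_page {
	bool can_migrate;  /* whether the modelled migration succeeds */
	bool isolated;     /* still held off the LRU by compaction */
};

/*
 * One migration pass over n pages. A page that migrates leaves the
 * list. A page that fails either stays isolated for another pass
 * (putback_on_fail == false) or is returned to the LRU immediately
 * (putback_on_fail == true). Returns how many pages remain isolated.
 */
size_t sketch_migrate_pass(struct sketch_page *pages, size_t n,
			   bool putback_on_fail)
{
	size_t still_isolated = 0;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].isolated)
			continue;
		if (pages[i].can_migrate || putback_on_fail)
			pages[i].isolated = false;
		else
			still_isolated++;
	}
	return still_isolated;
}
```

Without putback, the unmigratable pages are carried into every later pass
and keep demanding matching free pages. With putback, the list drains to
zero after a single pass.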

Even in that worst-case scenario, there is a limit to how many pages will
be removed from the free lists. When isolating free pages, split_free_page
is called, and one of the checks it makes is:

        /* Obey watermarks or the system could deadlock */
        watermark = low_wmark_pages(zone) + (1 << order);
        if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
                return 0;

i.e. it shouldn't be isolating pages if watermarks get messed up. If
enough free pages are not available, migration should fail, compaction
therefore fails and all the pages get put back.

Bottom line, I do not expect it to be bad. I'm much more concerned about
zone->lock getting hammered by repeatedly isolating free pages and then
freeing them back to the lists because page migration keeps failing.

> Presumably you need some kind of lock break heuristic.
> 

The heuristic I'm going for is "never be taking too many pages".

Just in case though, I'll put in a

	WARN_ON_ONCE(nr_migratepages > MAX_ORDER_NR_PAGES * 3);

in isolate_free_pages. If that warning triggers, it likely means the
lock is being held too long.
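
For readers unfamiliar with the macro, the behaviour of that warning can
be approximated in userspace as below. This is a sketch only. The
sketch_* names and the explicit state struct are hypothetical (the kernel
macro keeps its flag in a hidden static variable), and the 1024-page
pageblock size is an assumed default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_MAX_ORDER_NR_PAGES 1024UL  /* assumed: 1 << (11 - 1) */

/* Hypothetical explicit state for the "once" part of the warning. */
struct sketch_warn_state {
	bool warned;
};

/* Print the message the first time the condition holds and stay
 * silent afterwards. Returns the condition, like WARN_ON_ONCE(). */
bool sketch_warn_on_once(struct sketch_warn_state *st, bool condition,
			 const char *msg)
{
	assert(st != NULL);

	if (condition && !st->warned) {
		st->warned = true;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return condition;
}

/* The proposed check: more than three pageblocks' worth of pages
 * queued for migration suggests the lock is held too long. */
bool sketch_check_isolated(struct sketch_warn_state *st,
			   unsigned long nr_migratepages)
{
	return sketch_warn_on_once(st,
		nr_migratepages > SKETCH_MAX_ORDER_NR_PAGES * 3,
		"too many pages isolated for migration");
}
```

The check costs one comparison per call and reports at most once, so it
can stay in the hot path without flooding the log.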

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab