Date:	Mon, 27 Jun 2016 14:45:18 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Vlastimil Babka <vbabka@...e.cz>, mgorman@...hsingularity.net,
	Minchan Kim <minchan@...nel.org>,
	Alexander Potapenko <glider@...gle.com>,
	Hugh Dickins <hughd@...gle.com>,
	Michal Hocko <mhocko@...nel.org>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Sasha Levin <sasha.levin@...cle.com>
Subject: Re: [PATCH v3 0/9] reduce memory usage by page_owner

On Fri, Jun 24, 2016 at 04:19:35PM -0700, Andrew Morton wrote:
> On Fri, 17 Jun 2016 16:57:30 +0900 js1304@...il.com wrote:
> 
> > There was a bug reported by Sasha, and minor fixes are needed,
> > so I'm sending v3.
> > 
> > o fix a bug reported by Sasha (mm/compaction: split freepages
> > without holding the zone lock)
> > o add code comment for todo list (mm/page_owner: use stackdepot
> > to store stacktrace) per Michal
> > o add 'inline' keyword (mm/page_alloc: introduce post allocation
> > processing on page allocator) per Vlastimil
> > o add a patch that cleans up code per Vlastimil
> 
> I've gone through v3 patches 2-9 and have plucked out the deltas to
> take what-i-had and turn that into what-you-sent.  Patch 1/9 has seen a
> lot of competing churn in isolate_freepages_block(), so please review
> the current version of that, below.  Between the "===" markers:

Hello, Andrew.

Below looks okay to me.

Thanks.

> 
> 
> static unsigned long isolate_freepages_block(struct compact_control *cc,
> 				unsigned long *start_pfn,
> 				unsigned long end_pfn,
> 				struct list_head *freelist,
> 				bool strict)
> {
> 	int nr_scanned = 0, total_isolated = 0;
> 	struct page *cursor, *valid_page = NULL;
> 	unsigned long flags = 0;
> 	bool locked = false;
> 	unsigned long blockpfn = *start_pfn;
> 	unsigned int order;
> 
> 	cursor = pfn_to_page(blockpfn);
> 
> 	/* Isolate free pages. */
> 	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
> 		int isolated;
> 		struct page *page = cursor;
> 
> 		/*
> 		 * Periodically drop the lock (if held) regardless of its
> 		 * contention, to give a chance to IRQs. Abort if fatal signal
> 		 * pending or async compaction detects need_resched()
> 		 */
> 		if (!(blockpfn % SWAP_CLUSTER_MAX)
> 		    && compact_unlock_should_abort(&cc->zone->lock, flags,
> 								&locked, cc))
> 			break;
> 
> 		nr_scanned++;
> 		if (!pfn_valid_within(blockpfn))
> 			goto isolate_fail;
> 
> 		if (!valid_page)
> 			valid_page = page;
> 
> 		/*
> 		 * For compound pages such as THP and hugetlbfs, we can save
> 		 * potentially a lot of iterations if we skip them at once.
> 		 * The check is racy, but we can consider only valid values
> 		 * and the only danger is skipping too much.
> 		 */
> 		if (PageCompound(page)) {
> 			unsigned int comp_order = compound_order(page);
> 
> 			if (likely(comp_order < MAX_ORDER)) {
> 				blockpfn += (1UL << comp_order) - 1;
> 				cursor += (1UL << comp_order) - 1;
> 			}
> 
> 			goto isolate_fail;
> 		}
> 
> 		if (!PageBuddy(page))
> 			goto isolate_fail;
> 
> ====================
> 		/*
> 		 * If we already hold the lock, we can skip some rechecking.
> 		 * Note that if we hold the lock now, checked_pageblock was
> 		 * already set in some previous iteration (or strict is true),
> 		 * so it is correct to skip the suitable migration target
> 		 * recheck as well.
> 		 */
> 		if (!locked) {
> 			/*
> 			 * The zone lock must be held to isolate freepages.
> 			 * Unfortunately this is a very coarse lock and can be
> 			 * heavily contended if there are parallel allocations
> 			 * or parallel compactions. For async compaction we do
> 			 * not spin on the lock, and we acquire it as late as
> 			 * possible.
> 			 */
> 			locked = compact_trylock_irqsave(&cc->zone->lock,
> 								&flags, cc);
> 			if (!locked)
> 				break;
> 
> 			/* Recheck this is a buddy page under lock */
> 			if (!PageBuddy(page))
> 				goto isolate_fail;
> 		}
> 
> 		/* Found a free page, will break it into order-0 pages */
> 		order = page_order(page);
> 		isolated = __isolate_free_page(page, order);
> 		if (!isolated)
> 			break;
> 		set_page_private(page, order);
> 
> 		total_isolated += isolated;
> 		cc->nr_freepages += isolated;
> 		list_add_tail(&page->lru, freelist);
> 
> 		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
> 			blockpfn += isolated;
> 			break;
> 		}
> 		/* Advance to the end of split page */
> 		blockpfn += isolated - 1;
> 		cursor += isolated - 1;
> 		continue;
> 
> isolate_fail:
> =====================
> 		if (strict)
> 			break;
> 		else
> 			continue;
> 
> 	}
> 
> 	if (locked)
> 		spin_unlock_irqrestore(&cc->zone->lock, flags);
> 
> 	/*
> 	 * There is a tiny chance that we have read bogus compound_order(),
> 	 * so be careful to not go outside of the pageblock.
> 	 */
> 	if (unlikely(blockpfn > end_pfn))
> 		blockpfn = end_pfn;
> 
> 	trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn,
> 					nr_scanned, total_isolated);
> 
> 	/* Record how far we have got within the block */
> 	*start_pfn = blockpfn;
> 
> 	/*
> 	 * If strict isolation is requested by CMA then check that all the
> 	 * pages requested were isolated. If there were any failures, 0 is
> 	 * returned and CMA will fail.
> 	 */
> 	if (strict && blockpfn < end_pfn)
> 		total_isolated = 0;
> 
> 	/* Update the pageblock-skip if the whole pageblock was scanned */
> 	if (blockpfn == end_pfn)
> 		update_pageblock_skip(cc, valid_page, total_isolated, false);
> 
> 	count_compact_events(COMPACTFREE_SCANNED, nr_scanned);
> 	if (total_isolated)
> 		count_compact_events(COMPACTISOLATED, total_isolated);
> 	return total_isolated;
> }
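
An aside on the locking pattern above: isolate_freepages_block() takes
cc->zone->lock as late as possible and drops it every SWAP_CLUSTER_MAX
pfns so that IRQs are not held off for a whole pageblock scan. The
sketch below is a minimal userspace analogue of that periodic-drop
idea, not kernel code: a pthread mutex stands in for the zone lock,
RELEASE_INTERVAL for SWAP_CLUSTER_MAX, and scan_block()/process_item()
are hypothetical names used only for illustration.

#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define RELEASE_INTERVAL 32	/* stand-in for SWAP_CLUSTER_MAX */

static pthread_mutex_t scan_lock = PTHREAD_MUTEX_INITIALIZER;

/* Placeholder per-item work; return false to abort the scan. */
static bool process_item(size_t i)
{
	(void)i;
	return true;
}

static size_t scan_block(size_t start, size_t end)
{
	bool locked = false;
	size_t i, done = 0;

	for (i = start; i < end; i++) {
		/* Periodically drop the lock so other lockers are not
		 * starved for the duration of the whole scan. */
		if (locked && !(i % RELEASE_INTERVAL)) {
			pthread_mutex_unlock(&scan_lock);
			locked = false;
		}
		/* Reacquire as late as possible, mirroring the
		 * compact_trylock_irqsave() call in the kernel code. */
		if (!locked) {
			pthread_mutex_lock(&scan_lock);
			locked = true;
		}
		if (!process_item(i))
			break;
		done++;
	}
	if (locked)
		pthread_mutex_unlock(&scan_lock);
	return done;
}

int main(void)
{
	printf("processed %zu items\n", scan_block(0, 1000));
	return 0;
}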
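
Similarly, on the "blockpfn += isolated - 1" accounting: on success,
__isolate_free_page() returns 1 << order, i.e. the number of order-0
pages the buddy page splits into, and the scan must resume exactly one
pfn past that page. The for-loop's own blockpfn++ supplies the final
+1, hence the -1. A tiny standalone check, with all names local to
this sketch:

#include <assert.h>
#include <stdio.h>

int main(void)
{
	unsigned long blockpfn = 100;		/* pfn of the buddy page */
	unsigned int order = 3;
	unsigned long isolated = 1UL << order;	/* 8 order-0 pages */

	/* The "continue" path: the loop's blockpfn++ adds the last +1. */
	blockpfn += isolated - 1;
	blockpfn++;
	assert(blockpfn == 100 + isolated);

	/* The early-break path ("blockpfn += isolated; break;") adds
	 * the full amount, since the loop increment is skipped. */

	printf("scan resumes at pfn %lu\n", blockpfn);
	return 0;
}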
