Date:   Wed, 30 Nov 2016 14:16:13 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Christoph Lameter <cl@...ux.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Linux-MM <linux-mm@...ck.org>,
        Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v3

On Wed, Nov 30, 2016 at 02:05:50PM +0100, Michal Hocko wrote:
> On Sun 27-11-16 13:19:54, Mel Gorman wrote:
> [...]
> > @@ -2588,18 +2594,22 @@ struct page *buffered_rmqueue(struct zone *preferred_zone,
> >  	struct page *page;
> >  	bool cold = ((gfp_flags & __GFP_COLD) != 0);
> >  
> > -	if (likely(order == 0)) {
> > +	if (likely(order <= PAGE_ALLOC_COSTLY_ORDER)) {
> >  		struct per_cpu_pages *pcp;
> >  		struct list_head *list;
> >  
> >  		local_irq_save(flags);
> >  		do {
> > +			unsigned int pindex;
> > +
> > +			pindex = order_to_pindex(migratetype, order);
> >  			pcp = &this_cpu_ptr(zone->pageset)->pcp;
> > -			list = &pcp->lists[migratetype];
> > +			list = &pcp->lists[pindex];
> >  			if (list_empty(list)) {
> > -				pcp->count += rmqueue_bulk(zone, 0,
> > +				int nr_pages = rmqueue_bulk(zone, order,
> >  						pcp->batch, list,
> >  						migratetype, cold);
> > +				pcp->count += (nr_pages << order);
> >  				if (unlikely(list_empty(list)))
> >  					goto failed;
> 
> Just a nit: we can reorder the check and the count update because nobody
> could have stolen the pages allocated by rmqueue_bulk.

Ok, it's minor but I can do that.
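
Roughly, the reordered sequence would look like this (just a sketch of
the intent, not the final patch):

	int nr_pages = rmqueue_bulk(zone, order,
			pcp->batch, list,
			migratetype, cold);
	/*
	 * Nothing can steal pages off this CPU's list between the bulk
	 * refill and here, so check for failure first and only then
	 * account the refilled pages.
	 */
	if (unlikely(list_empty(list)))
		goto failed;
	pcp->count += (nr_pages << order);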

> I would also consider
> nr_pages a bit misleading because we get the number of allocated elements.
> Nothing to lose sleep over...
> 

I couldn't think of a clearer name because, in this sort of context, I
consider a high-order page to be a single page.
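
To make the accounting concrete: pcp->count is kept in base pages while
rmqueue_bulk returns the number of elements placed on the list, hence the
shift. A hypothetical order-2 refill that places 8 elements on the list
looks like:

	/* Illustration only: 8 order-2 elements == 32 base pages */
	pcp->count += (nr_pages << order);	/* 8 << 2 == 32 */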

> >  			}
> 
> But... unless I am missing something, this effectively means that we do
> not exercise the high-order atomic reserves. Shouldn't we fall back to
> the locked __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC) for
> order > 0 && ALLOC_HARDER? Or is this just hidden in some other code
> path which I am not seeing?
> 

Good spot. Would this be acceptable to you?
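
The idea is that a high-order ALLOC_HARDER request whose pcp refill fails
simply falls back to the existing locked buddy path, which already tries
the atomic reserve first. From memory, that path looks roughly like this
(details may differ):

	spin_lock_irqsave(&zone->lock, flags);
	do {
		page = NULL;
		if (alloc_flags & ALLOC_HARDER)
			/* Dip into the high-order atomic reserve first */
			page = __rmqueue_smallest(zone, order,
						MIGRATE_HIGHATOMIC);
		if (!page)
			page = __rmqueue(zone, order, migratetype);
	} while (page && check_new_pages(page, order));
	spin_unlock(&zone->lock);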

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 91dc68c2a717..94808f565f74 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2609,9 +2609,18 @@ struct page *buffered_rmqueue(struct zone *preferred_zone,
 				int nr_pages = rmqueue_bulk(zone, order,
 						pcp->batch, list,
 						migratetype, cold);
-				pcp->count += (nr_pages << order);
-				if (unlikely(list_empty(list)))
+				if (unlikely(list_empty(list))) {
+					/*
+					 * Retry high-order atomic allocs
+					 * from the buddy list which may
+					 * use MIGRATE_HIGHATOMIC.
+					 */
+					if (order && (alloc_flags & ALLOC_HARDER))
+						goto try_buddylist;
+
 					goto failed;
+				}
+				pcp->count += (nr_pages << order);
 			}
 
 			if (cold)
@@ -2624,6 +2633,7 @@ struct page *buffered_rmqueue(struct zone *preferred_zone,
 
 		} while (check_new_pcp(page));
 	} else {
+try_buddylist:
 		/*
 		 * We most definitely don't want callers attempting to
 		 * allocate greater than order-1 page units with __GFP_NOFAIL.
-- 
Mel Gorman
SUSE Labs
