linux-kernel - Re: [RFC PATCH] mm: page_alloc: High-order per-cpu page allocator

Message-ID: <20161123163351.6s76ijwnqoakgcud@techsingularity.net>
Date:   Wed, 23 Nov 2016 16:33:51 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     Linux-MM <linux-mm@...ck.org>, Christoph Lameter <cl@...ux.com>,
        Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] mm: page_alloc: High-order per-cpu page allocator

On Wed, Nov 23, 2016 at 04:37:06PM +0100, Vlastimil Babka wrote:
> On 11/21/2016 04:55 PM, Mel Gorman wrote:
> 
> ...
> 
> > hackbench was also tested with both socket and pipes and both processes
> > and threads and the results are interesting in terms of how variability
> > is impacted
> > 
> > 1-socket machine -- pipes and processes
> >                         4.9.0-rc5             4.9.0-rc5
> >                           vanilla        highmark-v1r12
> > Amean    1      12.9637 (  0.00%)     12.9570 (  0.05%)
> > Amean    3      13.4770 (  0.00%)     13.4447 (  0.24%)
> > Amean    5      18.5333 (  0.00%)     19.0917 ( -3.01%)
> > Amean    7      24.5690 (  0.00%)     26.1010 ( -6.24%)
> > Amean    12     39.7990 (  0.00%)     40.6763 ( -2.20%)
> > Amean    16     56.0520 (  0.00%)     58.2530 ( -3.93%)
> 
> Here, higher values are better or worse?
> 

Higher values are worse.

> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 0f088f3a2fed..02eb24d90d70 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -255,6 +255,24 @@ enum zone_watermarks {
> >  	NR_WMARK
> >  };
> > 
> > +/*
> > + * One per migratetype for order-0 pages and one per high-order up to
> > + * and including PAGE_ALLOC_COSTLY_ORDER. This may allow unmovable
> > + * allocations to contaminate reclaimable pageblocks if high-order
> > + * pages are heavily used.
> > + */
> > +#define NR_PCP_LISTS (MIGRATE_PCPTYPES + PAGE_ALLOC_COSTLY_ORDER + 1)
> 
> Should it be "- 1" instead of "+ 1"?
> 

Yes.

> > +
> > +static inline unsigned int pindex_to_order(unsigned int pindex)
> > +{
> > +	return pindex < MIGRATE_PCPTYPES ? 0 : pindex - MIGRATE_PCPTYPES + 1;
> > +}
> > +
> > +static inline unsigned int order_to_pindex(int migratetype, unsigned int order)
> > +{
> > +	return (order == 0) ? migratetype : MIGRATE_PCPTYPES - 1 + order;
> 
> Here I think that "MIGRATE_PCPTYPES + order - 1" would be easier to
> understand as the array is for all migratetypes, but the order is shifted?
> 

As in migratetypes * costly_order? That would be excessively large.
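
To make the index layout under discussion concrete, here is a small user-space sketch of the two quoted helpers. It assumes MIGRATE_PCPTYPES and PAGE_ALLOC_COSTLY_ORDER are both 3, which is what I believe the 4.9-era headers define but which is not stated in this thread. highest_pindex() is a hypothetical helper added only for illustration; it walks every valid (migratetype, order) pair, checks that the mapping round-trips, and reports the largest index produced.

```c
#include <assert.h>

/* Assumed values (not stated in the thread); believed to match 4.9. */
#define MIGRATE_PCPTYPES        3
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Copied from the quoted patch: list index -> allocation order. */
static inline unsigned int pindex_to_order(unsigned int pindex)
{
	return pindex < MIGRATE_PCPTYPES ? 0 : pindex - MIGRATE_PCPTYPES + 1;
}

/* Copied from the quoted patch: (migratetype, order) -> list index. */
static inline unsigned int order_to_pindex(int migratetype, unsigned int order)
{
	return (order == 0) ? (unsigned int)migratetype
			    : MIGRATE_PCPTYPES - 1 + order;
}

/*
 * Hypothetical helper: visit every order up to and including
 * PAGE_ALLOC_COSTLY_ORDER with every pcp migratetype, assert that the
 * order survives the round trip, and return the highest index seen.
 */
static unsigned int highest_pindex(void)
{
	unsigned int max = 0;

	for (unsigned int order = 0; order <= PAGE_ALLOC_COSTLY_ORDER; order++) {
		for (int mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
			unsigned int pindex = order_to_pindex(mt, order);

			assert(pindex_to_order(pindex) == order);
			if (pindex > max)
				max = pindex;
		}
	}
	return max;
}
```

Under those assumed values, order-0 pages use indices 0 to 2 (one per migratetype) and orders 1 to 3 share indices 3 to 5 regardless of migratetype, so the highest index is MIGRATE_PCPTYPES + PAGE_ALLOC_COSTLY_ORDER - 1. Whatever value NR_PCP_LISTS ends up with has to leave room for that index.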

> > @@ -1083,10 +1083,12 @@ static bool bulkfree_pcp_prepare(struct page *page)
> >   * pinned" detection logic.
> >   */
> >  static void free_pcppages_bulk(struct zone *zone, int count,
> > -					struct per_cpu_pages *pcp)
> > +					struct per_cpu_pages *pcp,
> > +					int migratetype)
> >  {
> > -	int migratetype = 0;
> > -	int batch_free = 0;
> > +	unsigned int pindex = 0;
> 
> Should pindex be initialized to migratetype to match the list below?
> 

Functionally it doesn't matter. It affects which list is tried first if
the preferred list is empty. Arguably it would make more sense to init
it to NR_PCP_LISTS - 1 so all order-0 lists are always drained before the
high-order pages but there is not much justification for that.

I'll take your suggestion until there is data supporting that high-order
caches should be preserved.
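
For illustration only, a simplified user-space model of why the starting index changes the drain order but not the amount freed. This is not the kernel's free_pcppages_bulk(): the lists are modelled as plain counters, one page is taken per step, NR_LISTS is an assumed value, and both function names are hypothetical.

```c
#define NR_LISTS 6	/* assumed: 3 order-0 lists plus 3 high-order lists */

/*
 * Return the first non-empty list at or after 'start', wrapping around,
 * or -1 if every list is empty. counts[i] models the length of list i.
 */
static int next_nonempty(const unsigned int counts[NR_LISTS], unsigned int start)
{
	for (unsigned int step = 0; step < NR_LISTS; step++) {
		unsigned int idx = (start + step) % NR_LISTS;

		if (counts[idx] > 0)
			return (int)idx;
	}
	return -1;
}

/*
 * Free up to 'count' entries round-robin, beginning the search at
 * 'start'. Stores the index of the first list drained in *first
 * (-1 if nothing was freed) and returns the number of entries freed.
 */
static unsigned int drain_round_robin(unsigned int counts[NR_LISTS],
				      unsigned int start, unsigned int count,
				      int *first)
{
	unsigned int freed = 0;
	unsigned int pos = start;

	*first = -1;
	while (freed < count) {
		int idx = next_nonempty(counts, pos);

		if (idx < 0)
			break;		/* all lists are empty */
		if (*first < 0)
			*first = idx;
		counts[idx]--;
		freed++;
		pos = (unsigned int)idx + 1;	/* move on to the next list */
	}
	return freed;
}
```

With two entries on list 1 and one on list 3, starting at 0 drains list 1 first and starting at 3 drains list 3 first, but both runs free the same three entries. In this model the initial value only decides which cache is emptied first when the request is smaller than the total.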

Thanks.

-- 
Mel Gorman
SUSE Labs