lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 23 Mar 2021 18:43:05 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Chuck Lever <chuck.lever@...cle.com>,
        Christoph Hellwig <hch@...radead.org>,
        Alexander Duyck <alexander.duyck@...il.com>,
        Matthew Wilcox <willy@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-Net <netdev@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Linux-NFS <linux-nfs@...r.kernel.org>
Subject: Re: [PATCH 2/3] mm/page_alloc: Add a bulk page allocator

On Tue, Mar 23, 2021 at 05:00:08PM +0100, Jesper Dangaard Brouer wrote:
> > +	/*
> > +	 * If there are no allowed local zones that meets the watermarks then
> > +	 * try to allocate a single page and reclaim if necessary.
> > +	 */
> > +	if (!zone)
> > +		goto failed;
> > +
> > +	/* Attempt the batch allocation */
> > +	local_irq_save(flags);
> > +	pcp = &this_cpu_ptr(zone->pageset)->pcp;
> > +	pcp_list = &pcp->lists[ac.migratetype];
> > +
> > +	while (allocated < nr_pages) {
> > +		page = __rmqueue_pcplist(zone, ac.migratetype, alloc_flags,
> > +								pcp, pcp_list);
> 
> The function __rmqueue_pcplist() is now used two places, this cause the
> compiler to uninline the static function.
> 

This was expected. It was not something I was particularly happy with
but avoiding it was problematic without major refactoring.

> My tests show you should inline __rmqueue_pcplist().  See patch I'm
> using below signature, which also have some benchmark notes. (Please
> squash it into your patch and drop these notes).
> 

The cycle savings per element is very marginal at just 4 cycles. I
expect just the silly stat updates are way more costly but the series
that addresses that is likely to be controversial. As I know the cycle
budget for processing a packet is tight, I've applied the patch but am
keeping it separate to preserve the data in case someone points out that
is a big function to inline and "fixes" it.

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ