netdev - Re: [Lsf-pc] [LSF/MM TOPIC] Generic page-pool recycle facility?

Message-ID: <20160411085819.GE21128@suse.de>
Date:	Mon, 11 Apr 2016 09:58:19 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Jesper Dangaard Brouer <brouer@...hat.com>
Cc:	lsf@...ts.linux-foundation.org, linux-mm <linux-mm@...ck.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Brenden Blanco <bblanco@...mgrid.com>,
	James Bottomley <James.Bottomley@...senPartnership.com>,
	Tom Herbert <tom@...bertland.com>,
	lsf-pc@...ts.linux-foundation.org,
	Alexei Starovoitov <alexei.starovoitov@...il.com>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] Generic page-pool recycle facility?

On Thu, Apr 07, 2016 at 04:17:15PM +0200, Jesper Dangaard Brouer wrote:
> (Topic proposal for MM-summit)
> 
> Network Interface Cards (NIC) drivers, and increasing speeds stress
> the page-allocator (and DMA APIs).  A number of driver specific
> open-coded approaches exists that work-around these bottlenecks in the
> page allocator and DMA APIs. E.g. open-coded recycle mechanisms, and
> allocating larger pages and handing-out page "fragments".
> 
> I'm proposing a generic page-pool recycle facility, that can cover the
> driver use-cases, increase performance and open up for zero-copy RX.
> 

Which bottleneck dominates -- the page allocator or the DMA API when
setting up coherent pages?

I'm wary of another page allocator API being introduced if it's for
performance reasons. In response to this thread, I spent two days on
a series that boosts performance of the allocator in the fast paths by
11-18% to illustrate that there was low-hanging fruit for optimising. If
the one-LRU-per-node series was applied on top, there would be a further
boost to performance on the allocation side. It could be further boosted
if debugging checks and statistic updates were conditionally disabled by
the caller.

The main reason another allocator concerns me is that those pages
are effectively pinned and cannot be reclaimed by the VM in low memory
situations. It ends up needing its own API for tuning the size and hoping
all the drivers get it right without causing OOM situations. It becomes
a slippery slope of introducing shrinkers, locking and complexity. Then
callers start getting concerned about NUMA locality and having to deal
with multiple lists to maintain performance. Ultimately, it ends up being
as slow as the page allocator and back to square one, except now with more code.

If it's the DMA API that dominates then something may be required but it
should rely on the existing page allocator to alloc/free from. It would
also need something like drain_all_pages to force free everything in there
in low memory situations. Remember that multiple instances private to
drivers or tasks will require shrinker implementations and the complexity
may get unwieldy.
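
To make that shape concrete, here is a rough userspace sketch of a recycle
cache that sits on top of an existing allocator rather than replacing it.
Everything in it is an assumption for illustration: the names
(page_pool_sketch, pool_get, pool_put, pool_drain) are hypothetical, malloc
and free stand in for the page allocator, and there is no locking, NUMA
awareness or DMA mapping. It is not kernel code and not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_PAGE_SIZE 4096	/* assumed page size for this sketch */

/* A cached page stores the list link in its own first bytes. */
struct pool_page {
	struct pool_page *next;
};

/* Hypothetical recycle cache; malloc/free stand in for the page allocator. */
struct page_pool_sketch {
	struct pool_page *free_list;	/* recycled pages, LIFO order */
	size_t cached;			/* pages currently held in the cache */
	size_t max_cached;		/* cap so the cache cannot grow unbounded */
};

void pool_init(struct page_pool_sketch *pool, size_t max_cached)
{
	pool->free_list = NULL;
	pool->cached = 0;
	pool->max_cached = max_cached;
}

/* Reuse a recycled page if one exists, else ask the backing allocator. */
void *pool_get(struct page_pool_sketch *pool)
{
	struct pool_page *page = pool->free_list;

	if (page) {
		pool->free_list = page->next;
		pool->cached--;
		return page;
	}
	return malloc(POOL_PAGE_SIZE);
}

/* Recycle a page, or hand it straight back once the cap is reached. */
void pool_put(struct page_pool_sketch *pool, void *mem)
{
	struct pool_page *page = mem;

	assert(page != NULL);
	if (pool->cached >= pool->max_cached) {
		free(page);
		return;
	}
	page->next = pool->free_list;
	pool->free_list = page;
	pool->cached++;
}

/*
 * Give every cached page back to the backing allocator. This is the hook
 * that would be called under memory pressure. Returns the number freed.
 */
size_t pool_drain(struct page_pool_sketch *pool)
{
	size_t freed = 0;

	while (pool->free_list) {
		struct pool_page *page = pool->free_list;

		pool->free_list = page->next;
		free(page);
		freed++;
	}
	pool->cached = 0;
	return freed;
}
```

The cap in pool_put and the drain hook are the two pieces that keep the
cached pages from being effectively pinned. Every private instance of such
a cache would need its own equivalent of both.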

-- 
Mel Gorman
SUSE Labs