netdev - RE: order-0 vs order-N driver allocation. Was: [PATCH v10 07/12] net/mlx4_en: add page recycle to prepare rx ring for tx support

Message-ID: <063D6719AE5E284EB5DD2968C1650D6D5F50B52E@AcuExch.aculab.com>
Date:	Fri, 5 Aug 2016 15:33:27 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'Alexander Duyck' <alexander.duyck@...il.com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:	Jesper Dangaard Brouer <brouer@...hat.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Brenden Blanco <bblanco@...mgrid.com>,
	David Miller <davem@...emloft.net>,
	Netdev <netdev@...r.kernel.org>,
	Jamal Hadi Salim <jhs@...atatu.com>,
	Saeed Mahameed <saeedm@....mellanox.co.il>,
	"Martin KaFai Lau" <kafai@...com>, Ari Saha <as754m@....com>,
	Or Gerlitz <gerlitz.or@...il.com>,
	john fastabend <john.fastabend@...il.com>,
	"Hannes Frederic Sowa" <hannes@...essinduktion.org>,
	Thomas Graf <tgraf@...g.ch>,
	"Tom Herbert" <tom@...bertland.com>,
	Daniel Borkmann <daniel@...earbox.net>,
	"Tariq Toukan" <ttoukan.linux@...il.com>,
	Mel Gorman <mgorman@...hsingularity.net>,
	linux-mm <linux-mm@...ck.org>
Subject: RE: order-0 vs order-N driver allocation. Was: [PATCH v10 07/12]
 net/mlx4_en: add page recycle to prepare rx ring for tx support

From: Alexander Duyck
> Sent: 05 August 2016 16:15
...
> >
> > interesting idea. Like dma_map 1GB region and then allocate
> > pages from it only? but the rest of the kernel won't be able
> > to use them? so only some smaller region then? or it will be
> > a boot time flag to reserve this pseudo-huge page?
> 
> Yeah, something like that.  If we were already talking about
> allocating a pool of pages it might make sense to just setup something
> like this where you could reserve a 1GB region for a single 10G device
> for instance.  Then it would make the whole thing much easier to deal
> with since you would have a block of memory that should perform very
> well in terms of DMA accesses.

ISTM that the main kernel allocator ought to be keeping a cache
of pages that are already mapped into the various IOMMUs.
This might be a per-driver cache, but could be much wider.

Then if some code wants such a page it can be given one that is
already mapped.
Under memory pressure the cached pages could then be unmapped and
reused for other purposes.
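
As a rough user-space sketch of that idea (not kernel code): the names
premap_cache, cache_get, cache_put and cache_shrink are all made up here,
and fake_map/fake_unmap merely stand in for whatever expensive IOMMU
map/unmap calls a real driver would make. It only shows the shape: reuse a
page that is still mapped when one is cached, and drop mappings when asked
to shrink.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SZ   4096
#define CACHE_MAX 64

/* Hypothetical stand-in for a page plus its IOMMU/DMA mapping. */
struct mapped_page {
    void     *vaddr;  /* CPU address of the page */
    uint64_t  dma;    /* pretend bus address handed out by the "IOMMU" */
};

struct premap_cache {
    struct mapped_page slot[CACHE_MAX];
    int           count;   /* pages currently cached, still mapped */
    unsigned long maps;    /* expensive map operations performed */
    unsigned long unmaps;  /* expensive unmap operations performed */
};

/* Simulated expensive map; a real driver would use the DMA API here. */
static int fake_map(struct premap_cache *c, struct mapped_page *p)
{
    p->vaddr = malloc(PAGE_SZ);
    if (!p->vaddr)
        return -1;
    p->dma = (uint64_t)(uintptr_t)p->vaddr;  /* identity "mapping" */
    c->maps++;
    return 0;
}

/* Simulated expensive unmap; also releases the page. */
static void fake_unmap(struct premap_cache *c, struct mapped_page *p)
{
    free(p->vaddr);
    p->vaddr = NULL;
    c->unmaps++;
}

/* Get a page: hand out a cached, already-mapped one if possible. */
static int cache_get(struct premap_cache *c, struct mapped_page *out)
{
    assert(c && out);
    if (c->count > 0) {
        *out = c->slot[--c->count];
        return 0;
    }
    return fake_map(c, out);
}

/* Return a page, keeping its mapping alive while there is room. */
static void cache_put(struct premap_cache *c, struct mapped_page *p)
{
    assert(c && p && p->vaddr);
    if (c->count < CACHE_MAX)
        c->slot[c->count++] = *p;
    else
        fake_unmap(c, p);
}

/* Memory pressure: unmap up to 'want' cached pages, return how many went. */
static int cache_shrink(struct premap_cache *c, int want)
{
    int freed = 0;

    assert(c);
    while (c->count > 0 && freed < want) {
        fake_unmap(c, &c->slot[--c->count]);
        freed++;
    }
    return freed;
}
```

The get/put fast path never touches the mapping; only a cache miss or a
shrink request pays for a map or unmap.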

...
> In the Intel drivers for instance if the frame
> size is less than 256 bytes we just copy the whole thing out since it
> is cheaper to just extend the header copy rather than taking the extra
> hit for get_page/put_page.

How fast is 'rep movsb' (on cached addresses) on recent x86 CPUs?
It might actually be worth unconditionally copying the entire frame
on those CPUs.
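
To make the trade-off concrete, here is a small user-space sketch of a
copy-break decision. It is not taken from any driver: rx_page, pkt,
rx_build and rx_free are invented names, and the refcount increment and
decrement only stand in for get_page/put_page. Passing SIZE_MAX as the
threshold models the "always copy" case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical receive page with a reference count. */
struct rx_page {
    unsigned char data[4096];
    int refcount;
};

/* Hypothetical packet handed up the stack. */
struct pkt {
    unsigned char  *data;
    size_t          len;
    struct rx_page *page;  /* non-NULL if data still lives in the rx page */
};

/*
 * Frames shorter than 'copybreak' are copied into a private buffer so the
 * rx page can be reused at once; longer frames keep a page reference.
 * A 'copybreak' of SIZE_MAX means every frame is copied.
 */
static int rx_build(struct pkt *p, struct rx_page *pg, size_t len,
                    size_t copybreak)
{
    assert(p && pg && len <= sizeof(pg->data));
    p->len = len;
    if (len < copybreak) {
        p->data = malloc(len ? len : 1);
        if (!p->data)
            return -1;
        memcpy(p->data, pg->data, len);
        p->page = NULL;
    } else {
        pg->refcount++;          /* stands in for get_page() */
        p->data = pg->data;
        p->page = pg;
    }
    return 0;
}

/* Release a packet: drop the page reference or free the private copy. */
static void rx_free(struct pkt *p)
{
    if (p->page)
        p->page->refcount--;     /* stands in for put_page() */
    else
        free(p->data);
    p->data = NULL;
}
```

Where the threshold should sit depends on how the copy cost compares with
the reference-count traffic on a given CPU, which is the open question
above.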

A long time ago we found the breakeven point for the copy to be about
1KB on sparc mbus/sbus systems - and that code might not have been
aligning the copy.

	David