lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 5 Aug 2016 15:33:27 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'Alexander Duyck' <alexander.duyck@...il.com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:	Jesper Dangaard Brouer <brouer@...hat.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Brenden Blanco <bblanco@...mgrid.com>,
	David Miller <davem@...emloft.net>,
	Netdev <netdev@...r.kernel.org>,
	Jamal Hadi Salim <jhs@...atatu.com>,
	Saeed Mahameed <saeedm@....mellanox.co.il>,
	"Martin KaFai Lau" <kafai@...com>, Ari Saha <as754m@....com>,
	Or Gerlitz <gerlitz.or@...il.com>,
	john fastabend <john.fastabend@...il.com>,
	"Hannes Frederic Sowa" <hannes@...essinduktion.org>,
	Thomas Graf <tgraf@...g.ch>,
	"Tom Herbert" <tom@...bertland.com>,
	Daniel Borkmann <daniel@...earbox.net>,
	"Tariq Toukan" <ttoukan.linux@...il.com>,
	Mel Gorman <mgorman@...hsingularity.net>,
	linux-mm <linux-mm@...ck.org>
Subject: RE: order-0 vs order-N driver allocation. Was: [PATCH v10 07/12]
 net/mlx4_en: add page recycle to prepare rx ring for tx support

From: Alexander Duyck
> Sent: 05 August 2016 16:15
...
> >
> > interesting idea. Like dma_map 1GB region and then allocate
> > pages from it only? but the rest of the kernel won't be able
> > to use them? so only some smaller region then? or it will be
> > a boot time flag to reserve this pseudo-huge page?
> 
> Yeah, something like that.  If we were already talking about
> allocating a pool of pages it might make sense to just setup something
> like this where you could reserve a 1GB region for a single 10G device
> for instance.  Then it would make the whole thing much easier to deal
> with since you would have a block of memory that should perform very
> well in terms of DMA accesses.

ISTM that the main kernel allocator ought to be keeping a cache
of pages that are mapped into the various IOMMU.
This might be a per-driver cache, but could be much wider.

Then if some code wants such a page it can be allocated one that is
already mapped.
Under memory pressure the pages could then be reused for other purposes.

...
> In the Intel drivers for instance if the frame
> size is less than 256 bytes we just copy the whole thing out since it
> is cheaper to just extend the header copy rather than taking the extra
> hit for get_page/put_page.

How fast is 'rep movsb' (on cached addresses) on recent x86 cpu?
It might actually be worth unconditionally copying the entire frame
on those cpus.

A long time ago we found the breakeven point for the copy to be about
1kb on sparc mbus/sbus systems - and that might not have been aligning
the copy.

	David

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ