Date:	Mon, 27 Jul 2009 06:55:54 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Brice Goglin <Brice.Goglin@...ia.fr>,
	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [RFC] Idea about increasing efficiency of skb allocation in
	network devices

On Mon, Jul 27, 2009 at 09:58:22AM +0200, Eric Dumazet wrote:
> Brice Goglin wrote:
> > David Miller wrote:
> >> From: Neil Horman <nhorman@...driver.com>
> >> Date: Sun, 26 Jul 2009 20:36:09 -0400
> >>
> >>   
> >>> 	Since network devices DMA their data into a provided DMA
> >>> buffer (which can usually be at an arbitrary location, as they must
> >>> potentially cross several PCI buses to reach any memory location),
> >>> I'm postulating that it would increase our receive path efficiency
> >>> to provide a hint to the driver layer as to which NUMA node to
> >>> allocate an skb data buffer on.  This hint would be determined by a
> >>> feedback mechanism.  I was thinking that we could provide a callback
> >>> function via the skb that accepted the skb and the originating
> >>> net_device.  This callback could track statistics on which NUMA
> >>> nodes consume (read: copy data from) skbs produced by specific net
> >>> devices.  Then, when that netdevice later allocates a new skb
> >>> (perhaps via netdev_alloc_skb), we could use that statistical
> >>> profile to determine whether the data buffer should be allocated on
> >>> the local node or on a remote node instead.
> >>>     
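> >>> Roughly what I have in mind (a completely untested sketch; the
> >>> numa_hint field and both helpers below are invented names, nothing
> >>> like them exists yet):
> >>>
> >>> #include <linux/netdevice.h>
> >>> #include <linux/nodemask.h>
> >>> #include <linux/topology.h>
> >>>
> >>> /* Hypothetical per-device table of where skb payloads get consumed. */
> >>> struct numa_hint {
> >>> 	atomic_t consumed[MAX_NUMNODES];
> >>> };
> >>>
> >>> /* Feedback callback, run when user data is copied out of an skb;
> >>>  * records the consuming node against the originating device. */
> >>> static void skb_numa_feedback(struct sk_buff *skb, struct net_device *dev)
> >>> {
> >>> 	atomic_inc(&dev->numa_hint->consumed[numa_node_id()]);
> >>> }
> >>>
> >>> /* At allocation time, pick the node that consumed the most skbs
> >>>  * from this device, defaulting to the device's own node. */
> >>> static int netdev_preferred_node(struct net_device *dev)
> >>> {
> >>> 	int node, best = dev_to_node(&dev->dev), max = 0;
> >>>
> >>> 	for_each_online_node(node) {
> >>> 		int n = atomic_read(&dev->numa_hint->consumed[node]);
> >>>
> >>> 		if (n > max) {
> >>> 			max = n;
> >>> 			best = node;
> >>> 		}
> >>> 	}
> >>> 	return best;
> >>> }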
> >> No matter what, you will do an inter-node memory operation.
> >>
> >> Unless the consumer NUMA node is the same as the one the
> >> device is on.
> >>
> >> Since the device is on a NUMA node, if you DMA remotely
> >> you've already eaten the NUMA cost.
> >>
> >> If you always DMA to the device's NUMA node (which is what we try to
> >> do now), at least there is the possibility of eliminating cross-NUMA
> >> traffic.
> >>
> >> Better to move the application or stack processing towards the NUMA
> >> node the network device is on, I think.
> >>   
> > 
> > Is there an easy way to get this NUMA node from the application socket
> > descriptor?
> 
> That's not easy; this information can change for every packet (think of
> bonding setups, with aggregation of devices on different NUMA nodes).
> 
> We could add a getsockopt() call to peek at this information for the next
> data to be read from the socket (returning the node id where the skb data
> is sitting, hoping the NIC driver hadn't applied copybreak to it, i.e.
> allocated a small skb and copied the device-provided data into it before
> feeding the packet to the network stack).
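>
> From the application side, usage could look like this (SO_PEEK_NODE is
> a made-up name and value, just to illustrate the proposed call):
>
> #include <sys/socket.h>
>
> /* Hypothetical option; no such sockopt exists today. */
> #ifndef SO_PEEK_NODE
> #define SO_PEEK_NODE 60
> #endif
>
> /* Ask where the next pending data is sitting, so the caller can,
>  * for example, migrate the consuming thread closer to that node. */
> static int peek_rx_node(int fd)
> {
> 	int node = -1;
> 	socklen_t len = sizeof(node);
>
> 	if (getsockopt(fd, SOL_SOCKET, SO_PEEK_NODE, &node, &len) < 0)
> 		return -1;
> 	return node;	/* NUMA node holding the next skb's data */
> }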
> 
Would a proc or debugfs interface perhaps be helpful here?  Something that
showed a statistical distribution of how many packets were received by each
process on each IRQ (operating under the assumption that each rx queue has
its own MSI IRQ, giving us an easy identifier).
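
Something like this, say (an untested sketch; struct irq_rx_stat and
rx_stat_list are imaginary, and collecting the counts on the receive
path is hand-waved entirely):

#include <linux/debugfs.h>
#include <linux/fs.h>
#include <linux/list.h>
#include <linux/seq_file.h>

/* Imaginary per-irq, per-process record filled in on the rx path. */
struct irq_rx_stat {
	struct list_head list;
	unsigned int irq;
	pid_t pid;
	u64 packets;
};

static LIST_HEAD(rx_stat_list);

static int rx_stat_show(struct seq_file *s, void *unused)
{
	struct irq_rx_stat *st;

	seq_puts(s, "irq  pid    packets\n");
	list_for_each_entry(st, &rx_stat_list, list)
		seq_printf(s, "%-4u %-6d %llu\n", st->irq, st->pid,
			   (unsigned long long)st->packets);
	return 0;
}

static int rx_stat_open(struct inode *inode, struct file *file)
{
	return single_open(file, rx_stat_show, NULL);
}

static const struct file_operations rx_stat_fops = {
	.open	 = rx_stat_open,
	.read	 = seq_read,
	.llseek	 = seq_lseek,
	.release = single_release,
};

/* registered somewhere in init with:
 * debugfs_create_file("rx_irq_stats", 0444, NULL, NULL, &rx_stat_fops);
 */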

Neil

> 
> > Also, one question that was raised at the Linux Symposium is: how do you
> > know which processors run the receive queue for a specific connection?
> > It would be nice to have a way to retrieve such information in the
> > application to avoid inter-node and inter-core/cache traffic.
> 
> All this depends on whether you have multiqueue devices or not, and
> whether traffic spreads across all queues or not.
> 
> Assuming you have a single-queue device, the only current way to handle
> this is to reverse the thinking.
> 
> I.e., bind NIC interrupts to the appropriate set of cpus, and possibly
> bind user application threads dealing with network traffic to the same
> set.
> 
> Only background or cpu-hungry threads should be allowed to run
> on foreign nodes.
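>
> The thread side is just plain cpu affinity, something like this (cpu
> numbers are examples; the irq itself is bound by writing a mask to
> /proc/irq/<N>/smp_affinity):
>
> #define _GNU_SOURCE
> #include <sched.h>
>
> /* Pin the calling thread to the cpus that service the NIC irq. */
> static int bind_to_nic_cpus(void)
> {
> 	cpu_set_t set;
>
> 	CPU_ZERO(&set);
> 	CPU_SET(0, &set);	/* say cpus 0 and 1 handle the NIC irq */
> 	CPU_SET(1, &set);
> 	return sched_setaffinity(0, sizeof(set), &set);
> }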
> 
> 