Date:	Sat, 16 Jan 2010 02:53:15 -0800
From:	"Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@...el.com>
To:	David Miller <davem@...emloft.net>
CC:	"krkumar2@...ibm.com" <krkumar2@...ibm.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>
Subject: RE: ixgbe: [RFC] [PATCH] Fix return of invalid txq

>-----Original Message-----
>From: David Miller [mailto:davem@...emloft.net]
>Sent: Friday, January 15, 2010 1:06 AM
>To: Waskiewicz Jr, Peter P
>Cc: krkumar2@...ibm.com; netdev@...r.kernel.org; Kirsher, Jeffrey T
>Subject: Re: ixgbe: [RFC] [PATCH] Fix return of invalid txq
>
>From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@...el.com>
>Date: Fri, 15 Jan 2010 01:00:20 -0800
>
>> What I've been thinking of is more for the NUMA allocations per port.
>> If we have, say 2 sockets, 8 cores a piece, then we have 16 CPUs.  If
>we
>> assign a port to socket 0, I think the best use of resources is to
>> allocate 8 Rx/Tx queues, one per core in that socket.  If an
>application
>> comes from the other socket, we can have a table to map the other 8
>> cores from that socket into the 8 queues, instead of piling them all
>> into one of the Tx queues.
>
>I fail to see how this can act substantially better than simply
>feeding traffic evenly amongst whatever group of queues have
>been configured.

On systems with a large number of cores, and depending on how a NIC is bound to a set of processors, there is a difference.  If a port is allocated to socket three on a large NUMA system, then it could be using CPUs 16-23, and I'd have 8 Tx/Rx queues.  I can either use the simple approach Krishna has, which could require executing two subtraction operations, or I can set up a lookup table in the driver at open() time and then do a single indexed lookup.  The lookup table makes the queue selection O(1) for whatever CPU/queue layout we come up with.

Either way works, though.  I still think the table is the better way to go, because of the determinism for any system and NIC configuration/layout.  The overhead of configuring the table is incurred during open(), so it's not in the hot path at all.
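To make the idea concrete, here's a rough user-space sketch of the table approach -- this is not ixgbe code, and the names (build_cpu_to_queue_map, select_tx_queue) and the 2-socket/8-core/8-queue layout are just made up for illustration.  The table is filled once at "open()" time; the transmit path then does a single indexed load:

/*
 * Hypothetical sketch of a CPU -> Tx queue lookup table.  In a real
 * driver the table would be built in the open() path and consulted in
 * the Tx queue selection hook; here it's simulated in user space.
 *
 * Assumed layout: 2 sockets, 8 cores each, port bound to one socket
 * with 8 Tx queues.
 */
#include <stdio.h>

#define NUM_CPUS         16
#define CORES_PER_SOCKET  8
#define NUM_TX_QUEUES     8

static unsigned int cpu_to_queue[NUM_CPUS];

/* Done once at "open()" time, so the cost stays out of the hot path. */
static void build_cpu_to_queue_map(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NUM_CPUS; cpu++) {
		/*
		 * CPUs on the port's socket map 1:1 onto its queues;
		 * CPUs on the other socket fold onto the same queues by
		 * core index instead of piling onto a single Tx queue.
		 * Any other layout could be encoded here without
		 * changing the hot path below.
		 */
		cpu_to_queue[cpu] = cpu % CORES_PER_SOCKET;
	}
}

/* Hot-path selection: one indexed load, O(1) for any CPU/queue layout. */
static unsigned int select_tx_queue(unsigned int cpu)
{
	return cpu_to_queue[cpu];
}

int main(void)
{
	unsigned int cpu;

	build_cpu_to_queue_map();
	for (cpu = 0; cpu < NUM_CPUS; cpu++)
		printf("cpu %2u -> txq %u\n", cpu, select_tx_queue(cpu));
	return 0;
}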

Cheers,
-PJ