Date:	Fri, 07 Oct 2011 14:37:47 -0400
From:	starlight@...nacle.cx
To:	chetan loke <loke.chetan@...il.com>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	linux-kernel@...r.kernel.org, netdev <netdev@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Christoph Lameter <cl@...two.org>, Willy Tarreau <w@....eu>,
	Ingo Molnar <mingo@...e.hu>,
	Stephen Hemminger <stephen.hemminger@...tta.com>,
	Benjamin LaHaise <bcrl@...ck.org>,
	Joe Perches <joe@...ches.com>, lokechetan@...il.com,
	Con Kolivas <conman@...ivas.org>,
	Serge Belyshev <belyshev@...ni.sinp.msu.ru>
Subject: Re: big picture UDP/IP performance question re 2.6.18
  -> 2.6.32

At 02:09 PM 10/7/2011 -0400, chetan loke wrote:
>I'm a little confused. Seems like there are
>conflicting goals. If you want to bypass the
>kernel-protocol-stack then you have the following
>options: a) kernel af_packet. This is where we
>would get a chance to test all the kernel features
>etc.

Perhaps I haven't been sufficiently clear.
The "packet socket" mode I refer to in the
earlier post was using AF/PF_PACKET mode sockets
as in

   socket(PF_PACKET, SOCK_RAW, eth_p_all);

Have run it in both normal and memory mapped
modes.  MMAP mode is a slight bit more expensive
due to the cache pressure from the additional
copy.  On the 6174 MMAP seems to be a smidgen
better in certain tests, but in the end both
read() and mapped approaches are effectively
identical on performance--and generally match
the cost of UDP sockets almost exactly.
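
For readers following along, here is a minimal sketch of the two
receive modes compared above, assuming a Linux system. The helper
names (open_packet_socket, fill_ring_req, map_rx_ring) and any ring
sizes passed in are hypothetical and not taken from the application
under discussion. Note that the protocol argument is passed in network
byte order, hence htons(ETH_P_ALL).

```c
#include <arpa/inet.h>        /* htons() */
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* struct tpacket_req, PACKET_RX_RING */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>

/* Hypothetical helper: open a raw packet socket that sees every
 * protocol. Needs CAP_NET_RAW; returns -1 with errno set otherwise.
 * In "normal" mode the caller simply read()s or recv()s frames from
 * the returned descriptor. */
int open_packet_socket(void)
{
    return socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
}

/* Hypothetical helper: fill in the RX ring geometry for mmap mode.
 * The frame count must equal (frames per block) * (number of blocks).
 * This sketch conservatively requires block_size to be an exact
 * multiple of frame_size; the kernel applies further checks of its
 * own (page-aligned blocks, minimum and aligned frame size). */
int fill_ring_req(struct tpacket_req *req, unsigned int block_size,
                  unsigned int frame_size, unsigned int block_nr)
{
    if (block_size == 0 || frame_size == 0 || block_nr == 0 ||
        block_size % frame_size != 0)
        return -1;
    req->tp_block_size = block_size;
    req->tp_frame_size = frame_size;
    req->tp_block_nr   = block_nr;
    req->tp_frame_nr   = (block_size / frame_size) * block_nr;
    return 0;
}

/* Hypothetical helper: ask the kernel for an RX ring on fd and map it
 * into the process. Returns MAP_FAILED on any error. */
void *map_rx_ring(int fd, const struct tpacket_req *req)
{
    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req)) < 0)
        return MAP_FAILED;
    size_t len = (size_t)req->tp_block_size * req->tp_block_nr;
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

In mmap mode the application walks the frames in the mapped region and
checks each frame's status word instead of issuing one read() per
packet.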

>b) Use non-commodity(?) NICs(from vendors
>you mentioned): where it might have some on-board
>memory(cushion) and so it can absorb the spikes
>and can also smoothen out too many
>PCI-transactions for bursty (and small payload -
>as in 64 byte traffic). But wait, when you use the
>libs provided by these vendors, then their
>driver(especially the Rx path) is not so much
>working in inline mode as NIC drivers in case a)
>above. This driver with a special Rx-path purely
>exists for managing your mmap'd queues. So
>of course it's going to be faster than the
>traditional inline drivers. In this partial-inline
>mode, the adapter might i) batch the packets and
>ii) send a single notification to the
>host-side. With that single event you are now
>processing 1+ packets.

Kernel bypass is probably the best answer for
what we do.  Problem has been lack of maturity
in their driver software.  Looks like it's reaching
a point where they cover our use case.  As mentioned
earlier, Solarflare could not match the Intel
82599 + ixgbe for this app last year.  Was a
disaster.  Myricom is focused on UDP (better
for us), but only just added multi-core IRQ
doorbell wakeups in recent months.  Previously
one had to accept all IRQs on a single core or
poll, neither of which works for us.

>You got it. In case of tilera there are two modes:
>tile-cpu in device mode: beats most of the
>non-COTS NICs. It runs linux on the adapter
>side. Imagine having the flexibility/power to
>program the ASIC using your favorite OS. It's
>orgasmic. So go for it!  tile-cpu in host-mode:
>Yes, it could be a game changer.

We almost went for the 1st gen Tile64 outboard
NIC approach, but were concerned about whether
they would survive--still are.  Intel has
crushed more than a few competitors along
the way.  If Google or Facebook buys into the
Tile-Gx it becomes a safe choice overnight.
