Date:	Fri, 25 Nov 2011 11:52:51 -0800
From:	Justin Pettit <jpettit@...ira.com>
To:	Stephen Hemminger <shemminger@...tta.com>
Cc:	jhs@...atatu.com, hadi@...erus.ca, Jesse Gross <jesse@...ira.com>,
	netdev <netdev@...r.kernel.org>, dev@...nvswitch.org,
	David Miller <davem@...emloft.net>,
	Chris Wright <chrisw@...hat.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Eric Dumazet <eric.dumazet@...il.com>,
	John Fastabend <john.r.fastabend@...el.com>
Subject: Re: Open vSwitch Design

On Nov 24, 2011, at 9:20 PM, Stephen Hemminger wrote:

>> This can be achieved easily with zero changes to the kernel code.
>> You need to have default filters that redirect flows to user space
>> when you fail to match.
> 
> Actually, this is what puts me off on the current implementation.
> I would prefer that the kernel implementation was just a software
> implementation of a hardware OpenFlow switch. That way it would
> be transparent that the control plane in user space was talking to kernel
> or hardware.

A big difficulty is finding an appropriate hardware abstraction.  I've worked on porting Open vSwitch to a few different vendors' switching ASICs, and they've all looked quite different from each other.  Even within a vendor, there can be fairly substantial differences.  Packet processing is broken up into stages (e.g., VLAN preprocessing, ingress ACL processing, L2 lookup, L3 lookup, packet modification, packet queuing, packet replication, egress ACL processing, etc.) and these can be done in different orders and have quite different behaviors.  Also, the size of the various tables varies widely between ASICs--even within the same family.

Hardware typically makes use of TCAMs, which support fast lookups of wildcarded flows.  They're expensive, though, so they're typically limited to entries in the very low thousands.  In software, we can trivially store hundreds of thousands of entries, but wildcarded lookups are very slow.  If we only use exact-match flows in the kernel (and leave the wildcarding to userspace, which handles kernel misses), the fastpath becomes a single hash lookup, which is extremely fast.
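
To make the split concrete, here is a minimal sketch in Python.  It is an illustration only, not the actual Open vSwitch code: the names (FlowKey, WildcardRule, Switch), the choice of key fields, and the "drop on no match" default are all assumptions made for the example.  The exact-match table stands in for the kernel fastpath, and the linear scan over prioritized wildcard rules stands in for the userspace slow path.

```python
from typing import Dict, List, NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical exact-match key; a real key has many more fields."""
    in_port: int
    eth_src: str
    eth_dst: str
    ip_dst: str


class WildcardRule(NamedTuple):
    """Hypothetical wildcarded rule: fields absent from `match` match anything."""
    priority: int
    match: Dict[str, object]
    action: str


class Switch:
    def __init__(self, rules: List[WildcardRule]) -> None:
        # Slow path ("userspace"): rules ordered by descending priority.
        self.rules = sorted(rules, key=lambda r: -r.priority)
        # Fast path ("kernel"): exact-match entries only, found by hashing.
        self.exact: Dict[FlowKey, str] = {}
        self.misses = 0

    def _slow_path(self, key: FlowKey) -> str:
        # Linear scan with wildcards; this is the expensive part.
        for rule in self.rules:
            if all(getattr(key, f) == v for f, v in rule.match.items()):
                return rule.action
        return "drop"  # assumed default when nothing matches

    def process(self, key: FlowKey) -> str:
        action = self.exact.get(key)  # one hash lookup on the fast path
        if action is None:
            # Miss: consult the slow path, then install an exact-match entry
            # so later packets of this flow never leave the fast path.
            self.misses += 1
            action = self._slow_path(key)
            self.exact[key] = action
        return action
```

With rules such as WildcardRule(10, {"ip_dst": "10.0.0.1"}, "output:2"), the first packet of a flow takes the slow path and every later packet with the same key is served from the exact-match table.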

Using exact-match entries has another big advantage: we can innovate the userspace portion without requiring changes to the kernel.  For example, we recently went from supporting a single OpenFlow table to 255 without any kernel changes.  This has an added benefit that a flow requiring multiple table lookups becomes a single hash lookup in the kernel, which is a huge performance gain in the fastpath.  Another example is our introduction of a number of metadata "registers" between tables that are never seen in the kernel, but open up a lot of interesting applications for OpenFlow controller writers.
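
A rough sketch of why multiple tables and registers cost nothing on the fastpath, again in Python and again purely illustrative: the table functions, the "reg0" register, the (in_port, eth_dst) key, and the Datapath class are hypothetical names for this example, not Open vSwitch interfaces.  The slow path walks the tables and uses registers as scratch space; only the resulting action list is installed under the exact-match key.

```python
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[int, str]  # hypothetical key: (in_port, eth_dst)
# A table inspects the key and registers, may append actions, and returns
# the id of the next table to visit, or None to end the pipeline.
Table = Callable[[Key, Dict[str, int], List[str]], Optional[int]]


def table0(key: Key, regs: Dict[str, int], actions: List[str]) -> Optional[int]:
    # Classify by ingress port and remember the result in a register.
    regs["reg0"] = 1 if key[0] == 1 else 2
    return 1


def table1(key: Key, regs: Dict[str, int], actions: List[str]) -> Optional[int]:
    # Use the register written by table 0 to choose the output port.
    actions.append("output:%d" % (10 + regs["reg0"]))
    return None


class Datapath:
    def __init__(self, tables: Dict[int, Table]) -> None:
        self.tables = tables
        self.cache: Dict[Key, Tuple[str, ...]] = {}  # exact-match fast path
        self.table_lookups = 0

    def _run_pipeline(self, key: Key) -> Tuple[str, ...]:
        regs: Dict[str, int] = {}  # registers exist only on the slow path
        actions: List[str] = []
        table_id: Optional[int] = 0
        while table_id is not None:
            self.table_lookups += 1
            table_id = self.tables[table_id](key, regs, actions)
        return tuple(actions)

    def process(self, key: Key) -> Tuple[str, ...]:
        cached = self.cache.get(key)
        if cached is None:
            # Several table lookups collapse into one cached entry.
            cached = self._run_pipeline(key)
            self.cache[key] = cached
        return cached
```

The first packet of a flow costs two table lookups here; every later packet costs one hash lookup, and the fast path never sees the tables or the registers, so either can change without touching it.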

If you're interested, we include a porting guide in the distribution that describes how one would go about bringing Open vSwitch to a new hardware or software platform:

	http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=blob;f=PORTING

Obviously, it's not that relevant here, since there's already a port to Linux.  :-)  But we've iterated over a few different designs and worked on other ports, and we've found this hardware/software abstraction layer to work pretty well.  In fact, multiple ports of Open vSwitch have been done by name-brand third-party vendors (this is the avenue most vendors use to get their OpenFlow support) and are now shipping.

We're always open to discussing ways that we can improve these interfaces, too, of course!

--Justin

