Message-ID: <65CD6CF6-C50F-47F7-85DB-12D480FA4712@mellanox.com>
Date: Mon, 18 Feb 2019 19:00:19 +0000
From: Yossi Kuperman <yossiku@...lanox.com>
To: Guy Shattah <sguy@...lanox.com>, Aaron Conole <aconole@...hat.com>,
John Hurley <john.hurley@...ronome.com>,
Simon Horman <simon.horman@...ronome.com>,
Justin Pettit <jpettit@....org>,
Gregory Rose <gvrose8192@...il.com>,
Eelco Chaudron <echaudro@...hat.com>,
Flavio Leitner <fbl@...hat.com>,
Florian Westphal <fwestpha@...hat.com>,
Jiri Pirko <jiri@...nulli.us>, Rashid Khan <rkhan@...hat.com>,
Sushil Kulkarni <sukulkar@...hat.com>,
Andy Gospodarek <andrew.gospodarek@...adcom.com>,
Roi Dayan <roid@...lanox.com>,
Yossi Kuperman <yossiku@...lanox.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Rony Efraim <ronye@...lanox.com>,
"davem@...emloft.net" <davem@...emloft.net>,
Marcelo Leitner <mleitner@...hat.com>,
Paul Blakey <paulb@...lanox.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Yet another approach for implementing connection tracking offload
Hello All,
Following is a description of yet another possible approach to implementing connection tracking offload; we would like to hear your opinion. There is the “native” way of implementing such an offload by mirroring the software tables to hardware. That way seems straightforward and simple, but real life is much more complicated. Alternatively, we can merge the data-path flows (separated by recirc_id) and offload a single flow to hardware.
The general idea is quite simple. When the OVS daemon configures TC with a filter that recirculates, the driver merely pretends to offload it and returns success. Upon packet arrival (in software) we let the packet traverse TC as usual, except that now we notify the driver on each successful match. By doing this, the driver has all the information necessary to merge the participating flows, including the connection tracking 5-tuple, into one equivalent flow. We do such a merge, and offload, only if the connection is established. Note: the same mechanism used to communicate a 5-tuple to the driver can also be used to notify on a filter match.
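
To make the notification side concrete, below is a very rough C sketch of what the driver side could look like. All names here (drv_filter_matched, struct merge_trace, struct drv_filter) are made up for illustration; no such hook exists in TC/flower today:

#define MAX_TRACE_DEPTH	8	/* arbitrary bound for this sketch */

struct drv_filter;	/* driver's record of a "pretend-offloaded" filter */

struct merge_trace {
	int			nhits;
	struct drv_filter	*hits[MAX_TRACE_DEPTH];
};

/* Hypothetical callback the driver would register with flower, invoked
 * on every successful software match of a filter the driver claimed to
 * offload. It only records which filter was hit; the 5-tuple of an
 * established connection would be reported through the same channel.
 */
static void drv_filter_matched(struct merge_trace *trace,
			       struct drv_filter *f)
{
	if (trace->nhits < MAX_TRACE_DEPTH)
		trace->hits[trace->nhits++] = f;
}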
It is the driver's responsibility to build and maintain the list of filters a (specific) packet hits along the TC walk. Once the packet reaches the last filter (a terminating one, e.g., forward), the driver posts work on a dedicated work-queue. In this work-queue context, we merge the participating filters and create a new filter that is logically equivalent (match + actions). The merge itself is not as complicated as it might seem; TC does all the heavy lifting, and this is not a random list of filters. At this point, we configure the hardware with one filter: either we have a match and the packet is handled by the hardware, or we don't and the packet goes to software unmodified.
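
Continuing the sketch above, the hand-off to the dedicated work-queue could look roughly like this; it uses the standard kernel work-queue API, while drv_merge_and_offload() is a placeholder for the actual key-intersection / action-concatenation and hardware programming:

#include <linux/workqueue.h>
#include <linux/slab.h>

struct merge_work {
	struct work_struct	work;
	struct merge_trace	trace;	/* snapshot of the filters hit */
};

static struct workqueue_struct *merge_wq;	/* the dedicated work-queue */

/* Work-queue context: merge the traced filters into one logically
 * equivalent filter and program it into hardware.
 */
static void drv_merge_work(struct work_struct *work)
{
	struct merge_work *mw = container_of(work, struct merge_work, work);

	drv_merge_and_offload(&mw->trace);	/* placeholder */
	kfree(mw);
}

/* Called when the packet hits a terminating filter (e.g. forward). */
static void drv_trace_complete(const struct merge_trace *trace)
{
	struct merge_work *mw = kzalloc(sizeof(*mw), GFP_ATOMIC);

	if (!mw)
		return;
	mw->trace = *trace;
	INIT_WORK(&mw->work, drv_merge_work);
	queue_work(merge_wq, &mw->work);
}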
Going along this path we must tackle two things: 1) counters and 2) TC filter deletion. 1) We must maintain TC counters as the user expects. Each merged filter holds a list of the filters it is derived from, its parents. Once a counter update is available for a merged filter, the driver must update the corresponding parents appropriately. 2) Upon TC filter deletion it is mandatory to remove all the derived (merged) filters from the hardware as a consequence.
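
For the bookkeeping in 1) and 2), each merged filter would keep back-references to its parents, so counter deltas fan out on update and derived filters are torn down on deletion. Again a purely hypothetical sketch (parent_filter, parent_link, drv_hw_filter_del are invented names):

struct parent_filter {
	struct list_head	children;	/* list of parent_link */
	u64			packets;	/* counters reported to TC */
	u64			bytes;
};

struct merged_filter {
	u64			hw_packets;	/* last HW counter snapshot */
	u64			hw_bytes;
	int			nparents;
	struct parent_filter	*parents[MAX_TRACE_DEPTH];
};

/* One entry on a parent's children list; a merged filter appears on
 * the list of every parent it was derived from.
 */
struct parent_link {
	struct list_head	node;
	struct merged_filter	*mf;
};

static void parent_stats_add(struct parent_filter *pf, u64 pkts, u64 bytes)
{
	pf->packets += pkts;
	pf->bytes += bytes;
}

/* 1) Counter update: credit the delta to every parent TC filter. */
static void merged_stats_update(struct merged_filter *mf,
				u64 packets, u64 bytes)
{
	u64 dpkts = packets - mf->hw_packets;
	u64 dbytes = bytes - mf->hw_bytes;
	int i;

	mf->hw_packets = packets;
	mf->hw_bytes = bytes;
	for (i = 0; i < mf->nparents; i++)
		parent_stats_add(mf->parents[i], dpkts, dbytes);
}

/* 2) TC filter deletion: tear down every merged filter derived from it. */
static void parent_filter_destroy(struct parent_filter *pf)
{
	struct parent_link *link, *tmp;

	list_for_each_entry_safe(link, tmp, &pf->children, node) {
		drv_hw_filter_del(link->mf);	/* remove from hardware */
		/* Real code must also unlink link->mf from its other
		 * parents before freeing; omitted for brevity.
		 */
		list_del(&link->node);
		kfree(link->mf);
		kfree(link);
	}
}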
Pros & Cons
Pros: 1) Circumvents the complexity involved in continuing in software where the hardware left off. 2) Simplifies the hardware pipeline to only one filter, which might improve overall performance.
Cons: 1) Only applicable to OVS-oriented filters; will not support priorities and overlapping filters. 2) The merge logic might consume CPU cycles, which might impact the rate of filters we can offload. However, this overhead is believed to be negligible if implemented carefully. 3) Requires TC/flower to notify the driver on each filter match (that is the only change needed above the driver).
Both approaches share the same software model; most of the code above the driver is shared. This approach can be considered temporary until the hardware matures.
What do you think about this approach?
If something is not clear please let me know and I will do my best to clarify.
Cheers,
Kuperman