Message-ID: <54EF4F88.2070809@intel.com>
Date: Thu, 26 Feb 2015 08:53:28 -0800
From: John Fastabend <john.r.fastabend@...el.com>
To: Shrijeet Mukherjee <shrijeet@...il.com>
CC: Thomas Graf <tgraf@...g.ch>, Jiri Pirko <jiri@...nulli.us>,
Simon Horman <simon.horman@...ronome.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"nhorman@...driver.com" <nhorman@...driver.com>,
"andy@...yhouse.net" <andy@...yhouse.net>,
"dborkman@...hat.com" <dborkman@...hat.com>,
"ogerlitz@...lanox.com" <ogerlitz@...lanox.com>,
"jesse@...ira.com" <jesse@...ira.com>,
"jpettit@...ira.com" <jpettit@...ira.com>,
"joestringer@...ira.com" <joestringer@...ira.com>,
"jhs@...atatu.com" <jhs@...atatu.com>,
"sfeldma@...il.com" <sfeldma@...il.com>,
"f.fainelli@...il.com" <f.fainelli@...il.com>,
"roopa@...ulusnetworks.com" <roopa@...ulusnetworks.com>,
"linville@...driver.com" <linville@...driver.com>,
"gospo@...ulusnetworks.com" <gospo@...ulusnetworks.com>,
"bcrl@...ck.org" <bcrl@...ck.org>
Subject: Re: Flows! Offload them.
On 02/26/2015 07:51 AM, Shrijeet Mukherjee wrote:
>
>
> On Thursday, February 26, 2015, John Fastabend <john.r.fastabend@...el.com> wrote:
>
> On 02/26/2015 07:25 AM, Shrijeet Mukherjee wrote:
> > However, for certain datacenter server use cases we actually have the
> > full user intent in user space as we configure all of the kernel
> > subsystems from a single central management agent running locally
> > on the server (OpenStack, Kubernetes, Mesos, ...), i.e. we do know
> > exactly what the user wants on the system as a whole. This intent is
> > then split into small configuration pieces to configure iptables, tc,
> > routes on multiple net namespaces (for example to implement VRF).
> >
> > E.g. A VRF in software would make use of net namespaces which holds
> > tenant specific ACLs, routes and QoS settings. A separate action
> > would fwd packets to the namespace. Easy and straight forward in
> > software. OTOH, the hardware, capable of implementing the ACLs,
> > would also need to know about the tc action which selected the
> > namespace when attempting to offload the ACL, as it would otherwise
> > apply the ACLs to the wrong packets.
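[The VRF-in-software setup described above (a namespace holding tenant-specific ACLs, routes and QoS) can be sketched with stock iproute2/iptables. This is an illustrative sketch, not from the thread: the port `eth1`, namespace name and addresses are all assumed, and the commands need root and a real device.]

```shell
# Per-tenant "VRF" built from a net namespace (names/addresses illustrative).
ip netns add tenant-a
# Move a port into the namespace; its routes/ACLs are now tenant-scoped.
ip link set eth1 netns tenant-a
ip netns exec tenant-a ip link set eth1 up
# Tenant-specific route.
ip netns exec tenant-a ip route add 10.0.0.0/24 dev eth1
# Tenant-specific ACL.
ip netns exec tenant-a iptables -A FORWARD -s 10.0.0.0/24 -j ACCEPT
# Tenant-specific QoS.
ip netns exec tenant-a tc qdisc add dev eth1 root fq_codel
```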
> >
> >
> > This is a new angle that I believe we have talked around in the context of user space policy, but not really considered.
> >
> > So the issue is: what if you have a classifier and a forward action which points to a device which the element doing the classification does not have access to, right?
> >
> > This problem obliquely showed up in the context of route table entries not in the "external" table but present in the software tables as well.
> >
> > Maybe the scheme requires an implicit "send to software" device which then diverts traffic to the right place ? Would creating an implicit, un-offload device address these concerns ?
>
> So I think there is a relatively simple solution for this. Assuming
> I read the description correctly: a packet ingresses the nic/switch
> and you want it to land in a namespace.
>
> Today we support offloaded macvlans and SR-IOV. What I would expect
> is the user creates a set of macvlans that are "offloaded"; this just means
> they are bound to a set of hardware queues and do not go through the
> normal receive path. Then assigning these to a namespace is the same
> as any other netdev.
>
> Hardware has an action to forward to "VSI" (virtual station interface)
> which matches on a packet and forwards it to either a VF or set of
> queues bound to a macvlan. Or you can do the forwarding using standards
> based protocols such as EVB (edge virtual bridging).
>
> So it's a simple set of steps with the flow API:
>
> 1. create macvlan with dfwd_offload set
> 2. push netdev into namespace
> 3. add flow rule to match traffic and send to VSI
> ./flow -i ethx set_rule match xyz action fwd_vsi 3
>
> The VSI# is reported by ip link today; it's a bit clumsy, so that
> interface could be cleaned up.
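[A sketch of the three steps above with stock tooling. This is illustrative, not from the thread: device names and the VSI number are assumed, the macvlan receive-path offload is the `l2-fwd-offload` ethtool feature on NICs that support it, and "flow" is the prototype tool John is discussing, not an upstream utility.]

```shell
# 1. create a macvlan whose receive path is offloaded to hardware queues
#    (requires a NIC exposing the l2-fwd-offload feature)
ethtool -K eth0 l2-fwd-offload on
ip link add link eth0 name mv0 type macvlan mode bridge

# 2. push the netdev into a namespace, like any other netdev
ip netns add tenant0
ip link set mv0 netns tenant0
ip netns exec tenant0 ip link set mv0 up

# 3. add a flow rule matching traffic and forwarding to the macvlan's VSI
#    (VSI number taken from `ip link` output; "flow" is the prototype tool)
./flow -i eth0 set_rule match xyz action fwd_vsi 3
```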
>
> Here is a case where trying to map this onto a 'tc' action in software
> is a bit awkward and convolutes what is really a simple operation.
> Anyways, this is not really an "offload" in the sense that you're taking
> something that used to run in software and moving it 1:1 into hardware.
> Adding SR-IOV/VMDQ support requires new constructs. By the way, if you
> don't like my "flow" tool and you want to move it onto "tc", that could
> be done as well, but the steps are the same.
>
> .John
>
>
> +1
>
> That is the un-offload device I was referencing. If we standardize
> and implicitly make it available, all packets that need to
> be sent to a construct that is not readily available in hardware go
> to this VSI and are then software forwarded. I am saying, though, that when
> this path is invoked, the path after the VSI is not offloaded.
Right, also the VSI may be the endpoint of the traffic. It could be
a VM for example or an application that is using the TCAM to offload
classification and data structures that are CPU expensive. In these
examples there is no software fwd path.
.John
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html