Message-ID: <54F80815.5030208@gmail.com>
Date: Wed, 04 Mar 2015 23:39:01 -0800
From: John Fastabend <john.fastabend@...il.com>
To: David Miller <davem@...emloft.net>
CC: therbert@...gle.com, davidch@...adcom.com,
simon.horman@...ronome.com, dev@...nvswitch.org,
netdev@...r.kernel.org, pablo@...filter.org
Subject: Re: [ovs-dev] OVS Offload Decision Proposal

On 03/04/2015 10:42 PM, David Miller wrote:
> From: Tom Herbert <therbert@...gle.com>
> Date: Wed, 4 Mar 2015 21:20:41 -0800
>
>> On Wed, Mar 4, 2015 at 9:00 PM, David Miller <davem@...emloft.net> wrote:
>>> From: John Fastabend <john.fastabend@...il.com>
>>> Date: Wed, 04 Mar 2015 17:54:54 -0800
>>>
>>>> I think a set operation _is_ necessary for OVS and other
>>>> applications that run in user space.
>>>
>>> It's necessary for the kernel to internally manage the chip
>>> flow resources.
>>>
>>> Full stop.
>>>
>>> It's not being exported to userspace. That is exactly the kind
>>> of open ended, outside the model, crap we're trying to avoid
>>> by putting everything into the kernel where we have consistent
>>> mechanisms, well understood behaviors, and rules.
>>
>> David,
>>
>> Just to make sure everyone is on the same page... this discussion has
>> been about where the policy of offload is implemented, not just who is
>> actually sending config bits to the device. The question is who gets
>> to decide how to best divvy up the finite resources of the device and
>> network amongst various requestors. Is this what you're referring to?
>
> I'm talking about only the kernel being able to make ->set() calls
> through the flow manager API to the device.
>
> Resource control is the kernel's job.
>
> You cannot delegate this crap between ipv4 routing in the kernel,
> L2 bridging in the kernel, and some user space crap. It's simply
> not going to happen.

The intent was to reserve space in the tables for L2, L3, user space,
and whatever else is needed. This reservation needs to come from the
administrator, because even the kernel doesn't know how much of my
table space I want to reserve for L2 vs. L3 vs. tc vs. ... The sizing
of each of these tables will depend on the use case. If I'm provisioning
L3 networks, I may want to create a large L3 table and no 'tc' table.
If I'm building a firewall box, I might want a small L3 table and a
large 'tc' table. Also, depending on how wide I want my matches in the
'tc' case, I may consume more or fewer resources in the hardware.

Once the resources are reserved, we wouldn't let user space write to
arbitrary tables, only to tables that have been explicitly reserved
for user space to write to.
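
To make the scheme concrete, here is a minimal sketch of the reservation
and write check described above. It is only an illustration in Python,
not kernel code and not a proposed API: the names FlowTables, reserve(),
set_rule() and TableFull, the owner strings, and the entry counts are
all made up for the example.

```python
from dataclasses import dataclass, field


class TableFull(Exception):
    """Raised when a table has used up its reserved entries."""


@dataclass
class Table:
    owner: str       # who may write: "l2", "l3", "tc", "user", ...
    capacity: int    # number of entries the administrator reserved
    rules: list = field(default_factory=list)


class FlowTables:
    """Hypothetical model of one device's shared table space."""

    def __init__(self, total_entries):
        self.total_entries = total_entries
        self.tables = {}

    def reserve(self, name, owner, capacity):
        """Administrator step: carve a named table out of the shared space."""
        if name in self.tables:
            raise ValueError(f"table {name!r} already reserved")
        reserved = sum(t.capacity for t in self.tables.values())
        if reserved + capacity > self.total_entries:
            raise ValueError("not enough table space left on the device")
        self.tables[name] = Table(owner, capacity)

    def set_rule(self, caller, name, rule):
        """Write a rule, but only into a table reserved for this caller."""
        table = self.tables[name]
        if table.owner != caller:
            raise PermissionError(f"{caller!r} may not write to table {name!r}")
        if len(table.rules) >= table.capacity:
            raise TableFull(name)
        table.rules.append(rule)


# Firewall-style split (assumed sizes): small L3 table, large user table.
hw = FlowTables(total_entries=1024)
hw.reserve("l3", owner="l3", capacity=128)
hw.reserve("acl", owner="user", capacity=896)
hw.set_rule("user", "acl", {"match": "tcp dport 22", "action": "drop"})
```

In this sketch a set_rule() from "user" against the "l3" table is
rejected, and so is any reservation that would exceed the device total.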

Even without the user space piece, we need this reservation when
the table space for L2, L3, etc. is shared. Otherwise driver writers
end up making a best guess for you, or delivering driver flavours
based on firmware, and you can only hope the driver writer guessed
something that is close to your network.
>
> All of the delegation of the hardware resource must occur in the
> kernel. Because only the kernel has a full view of all of the
> resources and how each and every subsystem needs to use it.
>
So I'm going to ask... even if we restrict set() using the above
scheme so that it only works on pre-defined tables, do you still see
an issue with it? I might be missing the point, but I could similarly
drive the set() calls through 'tc' via a new filter, call it xflow.

.John
--
John Fastabend Intel Corporation