Message-ID: <53335838.3060409@mojatatu.com>
Date: Wed, 26 Mar 2014 18:44:08 -0400
From: Jamal Hadi Salim <jhs@...atatu.com>
To: Florian Fainelli <f.fainelli@...il.com>,
Neil Horman <nhorman@...driver.com>
CC: Thomas Graf <tgraf@...g.ch>, Jiri Pirko <jiri@...nulli.us>,
netdev <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Andy Gospodarek <andy@...yhouse.net>,
dborkman <dborkman@...hat.com>, ogerlitz <ogerlitz@...lanox.com>,
jesse <jesse@...ira.com>, pshelar <pshelar@...ira.com>,
azhou <azhou@...ira.com>, Ben Hutchings <ben@...adent.org.uk>,
Stephen Hemminger <stephen@...workplumber.org>,
jeffrey.t.kirsher@...el.com, vyasevic <vyasevic@...hat.com>,
Cong Wang <xiyou.wangcong@...il.com>,
John Fastabend <john.r.fastabend@...el.com>,
Eric Dumazet <edumazet@...gle.com>,
Scott Feldman <sfeldma@...ulusnetworks.com>,
Lennert Buytenhek <buytenh@...tstofly.org>,
Felix Fietkau <nbd@...nwrt.org>
Subject: Re: [patch net-next RFC 0/4] introduce infrastructure for support
of switch chip datapath
On 03/26/14 15:11, Florian Fainelli wrote:
> 2014-03-26 11:21 GMT-07:00 Neil Horman <nhorman@...driver.com>:
>> Yes, this is the point of contention, you're right. And you're also correct in
>> that we do have several devices that bypass the network stack on the. My
>> concern is that, in all of those cases it's being bypassed because we know that
>> other software is handling that functionality (in the case of macvtap we know
>> that we're passing it off to a guest to be processed via the full network stack
>> available in the guest, and in the case of OVS, we know that we are passing
>> traffic to a software defined switch for handling). In the case of having a
>> switch fabric available, we're explicitly hiding the fact that traffic we are
>> passing between ports never touches the cpu, and that just rubs me the wrong
>> way. I suppose I'm looking at switch fabrics in the same way that I look at
>> TOE. In offloading forwarding functionality we remove from the cpu activity
>> which an administrator may reasonably expect to see handled in the cpu, but they
>> won't. In the case of macvlan, the admin knows that's a macvlan device, and
>> packet handling for frames bound to it occurs in the guest. For OVS, packets
>> received on the cpu with the proper encapsulation are clearly handled in the
>> OVS bridge. But in the case of a hardware switch, all they see are 4 net device
>> interfaces that seem like any other net device.
>
> Right, this is why Felix did not expose the switch ports as netdevices
> when he designed swconfig, because this would break the contract and
> assumptions that net_devices do actually transport data, and are not
> just used for control. It also made it easier to have a separate
> control path to expose the gazillion different configuration knobs
> that various switches offer...
>
Neil, I may be misreading your "TOE" semantics, but I think you view the
switch ports from a host prism. I am a middle box guy - I love it when
packets transiting through my box are offloaded. I can move more
bits/sec.
It is only TOE if the middle box is trying to do an end host function;->
OTOH, the OpenWrt view is probably because (if I understood correctly
last time) there are cases where there is no way to even pass packets
and attribute them to the originating switch ports. In fact, in some
cases there may be no way at all to even pass packets to the kernel.
Did I understand that part correctly?
I suppose this is eventually all part of that capability discovery.
[..]
>
> Part of the problem is that you might start seeing actual relevant
> traffic on these per-port net_devices e.g: during software learning
> times, where traffic to specific ports will also be mirrored to the
> CPU port for lossless (or close to) traffic delivery, and then some
> software agent on the CPU will decide to bridge/bond/add vlans to some
> ports, and then we won't be seeing traffic again on these per-port
> net_devices for a while (in the context of switches supporting tags).
> As such, I'd rather treat those per-port net_devices as almost regular
> net_devices to allow that traffic to flow, even though this is not a
> permanent state.
>
A nod from here.
I think it would be useful to enumerate these types of devices
and what their control/data capability is.
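
As a rough illustration of what such an enumeration might look like, here
is a minimal sketch in plain C. Every name in it (swport_cap, swport_desc,
the SWPORT_CAP_* bits, the helper functions) is hypothetical and made up
for this example; none of it is from the patch set or from any existing
kernel API. It only encodes the cases raised in this thread: whether
packets can reach the kernel at all, whether they can be attributed to the
originating port, and whether forwarding is offloaded to hardware.

```c
#include <stdbool.h>

/* Hypothetical capability bits for one switch port (illustration only). */
enum swport_cap {
	SWPORT_CAP_CPU_RX      = 1u << 0, /* packets can be passed to the kernel */
	SWPORT_CAP_RX_PORT_TAG = 1u << 1, /* rx packets carry the originating port */
	SWPORT_CAP_CPU_TX      = 1u << 2, /* kernel can send out of this port */
	SWPORT_CAP_HW_FWD      = 1u << 3, /* port-to-port traffic bypasses the cpu */
};

/* Hypothetical per-port descriptor that a driver could fill in. */
struct swport_desc {
	const char *name;
	unsigned int caps;
};

/* True when the port offers no data path to the cpu, only control. */
bool swport_is_control_only(const struct swport_desc *p)
{
	return !(p->caps & (SWPORT_CAP_CPU_RX | SWPORT_CAP_CPU_TX));
}

/* True when received packets can be attributed to the originating port. */
bool swport_can_attribute_rx(const struct swport_desc *p)
{
	unsigned int need = SWPORT_CAP_CPU_RX | SWPORT_CAP_RX_PORT_TAG;

	return (p->caps & need) == need;
}
```

With something along these lines, a port that only sets SWPORT_CAP_HW_FWD
would be reported as control-only, while a port on a tag-capable switch
would be reported as able to attribute received packets to their port.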
cheers,
jamal