Message-Id: <AC3C5F7C-E515-48FC-AA76-E7199C6FFB51@cumulusnetworks.com>
Date: Wed, 2 Apr 2014 08:37:28 -0700
From: Scott Feldman <sfeldma@...ulusnetworks.com>
To: Jiri Pirko <jiri@...nulli.us>
Cc: Roopa Prabhu <roopa@...ulusnetworks.com>,
Jamal Hadi Salim <jhs@...atatu.com>,
Florian Fainelli <f.fainelli@...il.com>,
Neil Horman <nhorman@...driver.com>,
Thomas Graf <tgraf@...g.ch>, netdev <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Andy Gospodarek <andy@...yhouse.net>,
dborkman <dborkman@...hat.com>, ogerlitz <ogerlitz@...lanox.com>,
jesse <jesse@...ira.com>, pshelar <pshelar@...ira.com>,
azhou <azhou@...ira.com>, Ben Hutchings <ben@...adent.org.uk>,
Stephen Hemminger <stephen@...workplumber.org>,
Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
vyasevic <vyasevic@...hat.com>,
Cong Wang <xiyou.wangcong@...il.com>,
John Fastabend <john.r.fastabend@...el.com>,
Eric Dumazet <edumazet@...gle.com>,
Lennert Buytenhek <buytenh@...tstofly.org>,
Shrijeet Mukherjee <shm@...ulusnetworks.com>
Subject: Re: [patch net-next RFC 0/4] introduce infrastructure for support of switch chip datapath
On Apr 1, 2014, at 11:41 PM, Jiri Pirko <jiri@...nulli.us> wrote:
> Tue, Apr 01, 2014 at 09:13:00PM CEST, sfeldma@...ulusnetworks.com wrote:
>>
>> On Mar 26, 2014, at 11:03 AM, Jiri Pirko <jiri@...nulli.us> wrote:
>>
>>> Wed, Mar 26, 2014 at 06:47:15PM CET, roopa@...ulusnetworks.com wrote:
>>>> On 3/26/14, 9:59 AM, Jiri Pirko wrote:
>>>>> Wed, Mar 26, 2014 at 05:54:17PM CET, roopa@...ulusnetworks.com wrote:
>>>>> So you implement bonding netlink api? Or you hook into bonding driver
>>>>> itself? Can you show us the code?
>>>> We use the netlink API and libnl. In our current model, our switch
>>>> chip driver listens to netlink notifications and programs the switch
>>>> chip. The switch chip driver uses libnl caches and libnl netlink apis
>>>> to reflect the kernel state to switch chip.
>>>
>>>
>>> So when you configure for example bonding over 2 ports, you actually use
>>> bonding driver to do that. And your userspace app listens to
>>> notifications and programs the switch chip accordingly. Am I close?
>>>
>>> How about data? Is this new "bonding" interface able to have an IP
>>> assigned to it and send/receive packets?
>>>
>>> I'm still not sure I understand your concept. Do you have some
>>> documentation for it available?
>>
>> Actually Jiri, this is the code you and I worked on recently to netlink-ify bonding/slave attributes and active/inactive notification. You have it right: the user uses the normal ip link tools and the bonding driver to create the bond, set attributes, and enslave switch ports. RTM_NEWLINK is used to program the ASIC to offload the LAG to HW. RTM_NEWLINK msgs contain the bond attributes (mode, etc.) and slave list, as well as slave status. This is enough information to program the ASIC. Once programmed, the ASIC offloads the data plane traffic and, in the case of egress, handles the LAG hash distribution. Only the LACP control plane traffic makes it to the bonding driver; data plane traffic does not.
>>
>> So, not trying to sound like a smart-ass, but the documentation is the bonding driver, specifically the netlink attributes/notifications.
>
> Ok, so no additional kernel code for this?
The kernel is rich with netlink. Bonds, bridges, vlans, vxlans, L3 route tables, flow tables, neigh table, addr table, and the list goes on, all give up their info via netlink. Add in a simple netdev-based abstraction for switch ports (such as yours) and you have everything (well, almost, the devil is in the details) you need to program switch chips. Netlink gives you the mgmt plane needed to offload the kernel's data plane to HW.
> Only some userspace agent programming the chip?
If using netlink, the agent programming the chip can live in the kernel (preferred) or in user-space. The netlink listener can reside in either place, since netlink is a multicast bus.
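
A minimal user-space sketch of such a listener, for illustration only; it is not code from this thread. It assumes Linux and Python's raw AF_NETLINK sockets, joins the RTMGRP_LINK multicast group, and decodes only the fixed nlmsghdr/ifinfomsg headers; the bond and slave attributes that follow those headers are not parsed here. The names parse_link_events, listen and program_asic are hypothetical.

```python
import socket
import struct

# Constants from <linux/rtnetlink.h>.
RTMGRP_LINK = 0x1   # multicast group carrying link (netdev) notifications
RTM_NEWLINK = 16
RTM_DELLINK = 17

# struct nlmsghdr: nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSG_HDR = struct.Struct("=IHHII")
# struct ifinfomsg: ifi_family, pad byte, ifi_type, ifi_index, ifi_flags, ifi_change
IFINFOMSG = struct.Struct("=BxHiII")


def parse_link_events(data):
    """Yield (msg_type, ifindex, ifi_flags) for each link message in one datagram."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        msg_len, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if msg_len < NLMSG_HDR.size or offset + msg_len > len(data):
            break  # truncated or malformed message: stop parsing
        is_link = msg_type in (RTM_NEWLINK, RTM_DELLINK)
        if is_link and msg_len >= NLMSG_HDR.size + IFINFOMSG.size:
            _family, _dev_type, ifindex, ifi_flags, _change = IFINFOMSG.unpack_from(
                data, offset + NLMSG_HDR.size
            )
            yield msg_type, ifindex, ifi_flags
        offset += (msg_len + 3) & ~3  # messages are 4-byte aligned (NLMSG_ALIGN)


def listen(program_asic):
    """Block forever, passing each link event to program_asic (hypothetical hook)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))  # (pid, groups): pid 0 lets the kernel assign one
    while True:
        for event in parse_link_events(sock.recv(65535)):
            program_asic(*event)
```

Calling listen(print) on a Linux box and then creating a bond or enslaving a port with ip link should print one tuple per notification. A real agent would also walk the attributes after ifinfomsg to get the bond mode and slave state.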
-scott