Message-ID: <501C3527.6060809@mellanox.com>
Date: Fri, 03 Aug 2012 13:31:35 -0700
From: Ali Ayoub <ali@...lanox.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: Or Gerlitz <ogerlitz@...lanox.com>, davem@...emloft.net,
roland@...nel.org, netdev@...r.kernel.org, sean.hefty@...el.com,
Erez Shitrit <erezsh@...lanox.co.il>
Subject: Re: [PATCH V2 09/12] net/eipoib: Add main driver functionality
On 8/2/2012 10:15 AM, Eric W. Biederman wrote:
> Or Gerlitz <ogerlitz@...lanox.com> writes:
>
>> From: Erez Shitrit <erezsh@...lanox.co.il>
>>
>> The eipoib driver provides a standard Ethernet netdevice over
>> the InfiniBand IPoIB interface.
>>
>> Some services can run only on top of Ethernet L2 interfaces, and cannot be
>> bound to an IPoIB interface. With this new driver, these services can run
>> seamlessly.
>
> Do I read this code correctly that what you are doing is not tunneling
> ethernet over IB but instead you are removing an ethernet header and
> replacing it with an IB header?
Correct.
eIPoIB runs standard IPoIB on the wire. It doesn't encapsulate the
Ethernet frame on top of IPoIB, but rather translates it into an IPoIB
packet. This allows us to expose an Ethernet L2 network device and
still keep interoperability with existing IPoIB endpoints. Running full
encapsulation (i.e. EoIPoIB) would break interoperability.
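To illustrate the header swap, here is a minimal userspace sketch in C.
It is not taken from the eipoib patch: the function name eth_to_ipoib
and the constants are hypothetical, and the 4-byte IPoIB encapsulation
header layout (2-byte type, 2-byte reserved) is assumed from RFC 4391.
The sketch handles untagged frames only and copies the payload
unchanged, so it does not cover the ARP payload rewrite that a real
translation would also need (IPoIB hardware addresses are 20 bytes,
not 6).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN   14  /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define IPOIB_HDR_LEN  4  /* type (2) + reserved (2), assumed per RFC 4391 */

/*
 * Hypothetical helper, for illustration only.
 * Drops the Ethernet header of an untagged frame and prepends an IPoIB
 * encapsulation header carrying the same protocol type.
 * Returns the number of bytes written to out, or -1 on error.
 */
long eth_to_ipoib(const uint8_t *frame, size_t len,
                  uint8_t *out, size_t out_cap)
{
    size_t payload_len;

    if (len < ETH_HDR_LEN)
        return -1;                      /* too short to hold an Ethernet header */

    payload_len = len - ETH_HDR_LEN;
    if (out_cap < IPOIB_HDR_LEN + payload_len)
        return -1;                      /* output buffer too small */

    out[0] = frame[12];                 /* EtherType, kept in network byte order */
    out[1] = frame[13];
    out[2] = 0;                         /* reserved field */
    out[3] = 0;

    /* The L3 payload is carried over as-is; the MAC addresses are discarded. */
    memcpy(out + IPOIB_HDR_LEN, frame + ETH_HDR_LEN, payload_len);
    return (long)(IPOIB_HDR_LEN + payload_len);
}
```

Because the MAC addresses are not carried on the wire, the receiving
side sees an ordinary IPoIB packet, which is the interoperability
property described above.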
> Do I also read this code correctly if you can't find your destination
> mac address in your "neighbor table" you do a normal IPoIB arp
> for the infiniband GUID?
Correct.
Wire protocol remains IPoIB.
> Do I read this right that if presented with a non-IPv4 or ARP packet
> this code will do something undefined and unpredictable?
The current code drops IPv6 packets (see IS_E_IPOIB_PROTO). IPv6 support
will be added later on.
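As a rough sketch of that kind of protocol gate, the C snippet below
accepts only IPv4 and ARP and reports everything else (IPv6 included)
as a drop. The real definition of IS_E_IPOIB_PROTO is not shown in
this message, so the function names, the constants, and the exact set
of accepted protocols are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard EtherType values, host byte order. */
#define ETYPE_IPV4 0x0800
#define ETYPE_ARP  0x0806
#define ETYPE_IPV6 0x86DD

/*
 * Hypothetical stand-in for a check like IS_E_IPOIB_PROTO.
 * Assumption: only IPv4 and ARP are translated.
 */
int eipoib_proto_supported(uint16_t ethertype)
{
    return ethertype == ETYPE_IPV4 || ethertype == ETYPE_ARP;
}

/*
 * Returns 1 if an untagged Ethernet frame should be dropped,
 * 0 if it can be handed to the translation path.
 */
int frame_should_drop(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    if (len < 14)
        return 1;                       /* no complete Ethernet header */

    /* EtherType sits at offset 12, big-endian on the wire. */
    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return !eipoib_proto_supported(ethertype);
}
```

With a gate like this the behaviour for unsupported protocols is a
well-defined drop rather than something undefined.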
> Maybe this makes some sense but just skimming it looks like you
> are trying to force a square peg into a round hole resulting in
> some weird code and some very weird maintainability issues.
>
> I am honestly surprised at this approach. I would think it would be
> faster and simpler to run an IB queue pair directly to the hypervisor or
> possibly even the guest operating system bypassing the kernel and doing
> all of this translation in userspace.
With the eIPoIB architecture, the VM sees a standard Ethernet emulator,
allowing the administrator to enslave the eIPoIB PIF to the
vSwitch/vBridge as if it were a standard Ethernet device. Other
approaches that expose an IB QP to the VM (with or without bypassing
the kernel) won't be possible with the current emulators and
management tools.
--
Ali;