Message-ID: <CAJZOPZL+hAQR_7MUGy9kLCzoLY7hbDwEGGodiMxc=yfPgmi8bQ@mail.gmail.com>
Date: Tue, 4 Sep 2012 21:57:01 +0300
From: Or Gerlitz <or.gerlitz@...il.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Or Gerlitz <ogerlitz@...lanox.com>, davem@...emloft.net,
roland@...nel.org, netdev@...r.kernel.org, sean.hefty@...el.com,
Erez Shitrit <erezsh@...lanox.co.il>,
Ali Ayoub <ali@...lanox.com>,
Doug Ledford <dledford@...hat.com>
Subject: Re: [PATCH V2 09/12] net/eipoib: Add main driver functionality
On Tue, Sep 4, 2012 at 12:22 AM, Michael S. Tsirkin <mst@...hat.com> wrote:
> On Mon, Sep 03, 2012 at 11:53:56PM +0300, Or Gerlitz wrote:
>> Michael S. Tsirkin <mst@...hat.com> wrote:
>> > [...] so it seems that a sane solution would involve an extra level of
>> > indirection, with guest addresses being translated to host IB addresses.
>> > As long as you do this, maybe using an ethernet frame format makes sense.
>>
>> > So far the things that make sense. Here are some that don't, to me:
>>
>> > - Is a pdf presentation all you have in terms of documentation?
>> > We are talking communication protocols here - I would expect a
>> > proper spec, and some effort to standardize, otherwise where's the
>> > guarantee it won't change in an incompatible way?
>> > Other things that I would expect to be addressed in such a spec is
>> > interaction with other IPoIB features, such as connected
>> > mode, checksum offloading etc, and IB features such as multipath etc.
>> >
>> > - The way you encode LID/QPN in the MAC seems questionable. IIRC there's
>> > more to IB addressing than just the LID. Since everyone on the subnet
>> > need access to this translation, I think it makes sense to store it in
>> > the SM. I think this would also obviate some IPv4 specific hacks in kernel.
>>
>> > - IGMP/MAC snooping in a driver is just too hairy.
>> > As you point out, bridge currently needs the uplink in promisc mode.
>> > I don't think a driver should work around that limitation.
>> > For some setups, it might be interesting to remove the promisc
>> > mode requirement, failing that, I think you could use macvtap passthrough.
>> >
>> > - Currently migration works without host kernel help, would be
>> > preferable to keep it that way.
>> If we rewind to this point, basically, you had few concerns
> I think some other people gave feedback too, you need to address it in
> the patch (as opposed to by mail - even if it's in documentation or
> comments) don't just focus on what I wrote.
The other feedback was:
1. A suggestion to introduce a new link layer for the para-virtualized
network stack, a direction pointed out by Dave and you, to which I
responded that it doesn't address a hard requirement I got, which is
to provide service to an arbitrary VM running any OS with a virtual
Ethernet NIC, emulated or para-virtualized.
2. Suggestions to use solutions which involve routing and/or
proxy-arp, to which I responded that in most practical cases this
would require the host to know the VM's IP, which isn't a valid
assumption in many cases, AND that a bunch of cloud stacks,
specifically the leading ones, don't even support that option. That
also violates a hard requirement we got: to support these stacks,
which are in use by customers.
3. Suggestions to invent EoIB -- I said OK, but this is a long-term
process that we're looking into. It might be around at some future
point, but we are still required to support whole ecosystems which use
IPoIB, and there's no point in "routing" between EoIB segments and
IPoIB segments; it's back to #2, which we didn't accept.
4. Other feedback saying the eIPoIB driver is messy and "we don't like
it" -- hard to address precisely through code changes.
----> all in all, we were offered a few directions which don't allow
us to address the problem statement, so there's no way to change the
code to satisfy them, and some non-concrete comments which are
also hard to address --> so we took the route of design changes along
your **concrete** comments. Doesn't that make sense?
Or.