Message-ID: <20120820185625.GA5234@redhat.com>
Date: Mon, 20 Aug 2012 21:57:26 +0300
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Or Gerlitz <ogerlitz@...lanox.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>, davem@...emloft.net,
roland@...nel.org, netdev@...r.kernel.org, ali@...lanox.com,
sean.hefty@...el.com, Erez Shitrit <erezsh@...lanox.co.il>,
Doug Ledford <dledford@...hat.com>
Subject: Re: [PATCH V2 09/12] net/eipoib: Add main driver functionality
On Sun, Aug 12, 2012 at 11:54:57PM +0300, Michael S. Tsirkin wrote:
> > and remember that
> > this code (VM through eipoib) can talk to any IPoIB element on the
> > fabric, native,
> > virtualized, HW/SW gateways, etc etc.
> >
> > Or.
>
> If you want this, then you really want a limited form of IPoIB bridging.
And to clarify that statement, here is how I would make such
IPoIB "bridging" work:

Guest side:
- Implement virtio-ipoib. This would be a device like virtio-net,
  but instead of ethernet packets, it would pass packets
  that consist of:
	IPoIB destination address
	IP packet
- the packet is passed to/from the host without modification, possibly
  with the addition of a header such as the virtio-net header
- flags such as broadcast can also be added to the header
- like virtio-net, get capabilities from the host and expose
  them as netdev capabilities
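
To make the guest-side packet layout above concrete, here is a rough
sketch (Python, for brevity only). The one fixed fact it relies on is
that an IPoIB hardware address is 20 bytes (1 flags byte, 24-bit QPN,
16-byte GID). The 2-byte header, the F_BROADCAST value and the function
names are hypothetical and are not taken from any virtio spec.

```python
import struct

IPOIB_ALEN = 20      # IPoIB hardware address: 1 flags byte + 3-byte QPN + 16-byte GID
F_BROADCAST = 0x01   # hypothetical header flag, not from any spec

# Hypothetical header: 1 byte of flags, 1 reserved byte.
HDR = struct.Struct("!BB")


def pack_frame(dst_addr: bytes, ip_packet: bytes, flags: int = 0) -> bytes:
    """Build header + IPoIB destination address + unmodified IP packet."""
    if len(dst_addr) != IPOIB_ALEN:
        raise ValueError("IPoIB address must be 20 bytes")
    return HDR.pack(flags, 0) + dst_addr + ip_packet


def unpack_frame(frame: bytes):
    """Split a frame back into (flags, destination address, IP packet)."""
    if len(frame) < HDR.size + IPOIB_ALEN:
        raise ValueError("frame too short")
    flags, _reserved = HDR.unpack_from(frame)
    dst = frame[HDR.size:HDR.size + IPOIB_ALEN]
    return flags, dst, frame[HDR.size + IPOIB_ALEN:]
```

The IP packet is carried as an opaque byte string, which matches the
"without modification" point: neither side rewrites it.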

Host side:
- create a macvtap-passthrough-like device that can sit on top of an
  ipoib interface
- expose this device's QPN and GID to the guest as its hardware address
- as we get a packet, forward it on UD QPN or CM as appropriate,
  depending on size, checksum and admin preference
- expose capabilities such as TSO
- can expose capabilities such as max MTU to the guest too
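
A rough sketch of two of the host-side steps above: building the
hardware address exposed to the guest from QPN and GID, and picking UD
or CM per packet. The address layout (flags byte, 24-bit QPN, 16-byte
GID) is the standard IPoIB one. The selection policy, the 2044-byte UD
limit (2048-byte IB MTU minus the 4-byte IPoIB header) and the
assumption that only the UD path offers checksum offload are
illustrative guesses, not a description of any existing driver.

```python
import struct

UD_MAX_PAYLOAD = 2044  # assumed: 2048-byte IB MTU minus 4-byte IPoIB header


def ipoib_hw_addr(qpn: int, gid: bytes, flags: int = 0) -> bytes:
    """Return the 20-byte IPoIB address: flags byte, 24-bit QPN, 16-byte GID."""
    if not 0 <= qpn < (1 << 24):
        raise ValueError("QPN must fit in 24 bits")
    if not 0 <= flags < (1 << 8):
        raise ValueError("flags must fit in 8 bits")
    if len(gid) != 16:
        raise ValueError("GID must be 16 bytes")
    return struct.pack("!I", (flags << 24) | qpn) + gid


def pick_transport(pkt_len: int, needs_csum_offload: bool = False,
                   allow_cm: bool = True, prefer_cm: bool = False,
                   ud_max: int = UD_MAX_PAYLOAD) -> str:
    """Hypothetical policy: choose "UD" or "CM" for one outgoing packet."""
    if pkt_len > ud_max:
        # Too large for a single UD datagram: only CM can carry it.
        if not allow_cm:
            raise ValueError("packet exceeds UD payload and CM is disabled")
        return "CM"
    if needs_csum_offload:
        # Assumption: checksum offload is available on the UD path only.
        return "UD"
    return "CM" if (allow_cm and prefer_cm) else "UD"
```

The same ud_max value could also be what the host reports to the guest
as max MTU when CM is disabled.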

The above means the hardware address changes with migration,
so we need to notify the guest when this happens.
The address change can be handled by the host notifying all
neighbours.
Alternatively, the guest can notify all neighbours.
Either way, notification can be done by broadcast.
The second option (guest notifies) seems preferable.

This ipoib-vtap can support two modes:
- bridge-like mode:
  guest-to-guest and guest-to-host packets
  can be detected by macvtap and passed
  to/from the guest directly, like macvlan bridge mode
- vepa-like mode:
  guest-to-guest and guest-to-host packets
  are sent out and looped back by the IB switch,
  like macvlan vepa mode
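
The difference between the two modes comes down to one forwarding
decision on transmit, sketched below. The function name, the mode
strings and the set of local addresses are hypothetical and only
illustrate the idea.

```python
def tx_path(dst_addr: bytes, local_addrs: set, mode: str) -> str:
    """Return "local" for direct delivery on the host, "wire" to send to the fabric."""
    if mode not in ("bridge", "vepa"):
        raise ValueError("unknown mode: %r" % (mode,))
    if mode == "bridge" and dst_addr in local_addrs:
        # Bridge-like: destination is another guest or the host itself,
        # so the packet never leaves the host.
        return "local"
    # VEPA-like, or a remote destination: everything goes out on the
    # wire and the IB switch loops local traffic back.
    return "wire"
```

Here local_addrs would hold the 20-byte IPoIB addresses of the host
and of every guest attached to the same ipoib interface.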

Compared to the custom protocol I sent, this has:
Advantages: interoperates cleanly with ipoib
Disadvantages: no support for legacy (ethernet-only) guests
--
MST