Message-ID: <20150410191723.GC1277@obsidianresearch.com>
Date:	Fri, 10 Apr 2015 13:17:23 -0600
From:	Jason Gunthorpe <jgunthorpe@...idianresearch.com>
To:	Doug Ledford <dledford@...hat.com>
Cc:	"ira.weiny" <ira.weiny@...el.com>,
	Michael Wang <yun.wang@...fitbricks.com>,
	Roland Dreier <roland@...nel.org>,
	Sean Hefty <sean.hefty@...el.com>, linux-rdma@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-nfs@...r.kernel.org,
	netdev@...r.kernel.org, Hal Rosenstock <hal.rosenstock@...il.com>,
	Tom Tucker <tom@...ngridcomputing.com>,
	Steve Wise <swise@...ngridcomputing.com>,
	Hoang-Nam Nguyen <hnguyen@...ibm.com>,
	Christoph Raisch <raisch@...ibm.com>,
	Mike Marciniszyn <infinipath@...el.com>,
	Eli Cohen <eli@...lanox.com>,
	Faisal Latif <faisal.latif@...el.com>,
	Upinder Malhi <umalhi@...co.com>,
	Trond Myklebust <trond.myklebust@...marydata.com>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	"David S. Miller" <davem@...emloft.net>,
	PJ Waskiewicz <pj.waskiewicz@...idfire.com>,
	Tatyana Nikolova <Tatyana.E.Nikolova@...el.com>,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Jack Morgenstein <jackm@....mellanox.co.il>,
	Haggai Eran <haggaie@...lanox.com>,
	Ilya Nelkenbaum <ilyan@...lanox.com>,
	Yann Droneaud <ydroneaud@...eya.com>,
	Bart Van Assche <bvanassche@....org>,
	Shachar Raindel <raindel@...lanox.com>,
	Sagi Grimberg <sagig@...lanox.com>,
	Devesh Sharma <devesh.sharma@...lex.com>,
	Matan Barak <matanb@...lanox.com>,
	Moni Shoua <monis@...lanox.com>, Jiri Kosina <jkosina@...e.cz>,
	Selvin Xavier <selvin.xavier@...lex.com>,
	Mitesh Ahuja <mitesh.ahuja@...lex.com>,
	Li RongQing <roy.qing.li@...il.com>,
	Rasmus Villemoes <linux@...musvillemoes.dk>,
	Alex Estrin <alex.estrin@...el.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Erez Shitrit <erezsh@...lanox.com>,
	Tom Gundersen <teg@...m.no>,
	Chuck Lever <chuck.lever@...cle.com>
Subject: Re: [PATCH v2 01/17] IB/Verbs: Implement new callback
 query_transport() for each HW
On Fri, Apr 10, 2015 at 02:24:26PM -0400, Doug Ledford wrote:
> IPoIB is more than just an ULP.  It's a spec.  And it's very IB
> specific.  It will only work with OPA because OPA is imitating IB.
> To run it on another fabric, you would need more than just to make
> it work.  If the new fabric doesn't have a broadcast group, or has
> multicast registration like IB does, you need the equivalent of
> IBTA, whatever that may be for this new fabric, buy in on the
> pre-defined multicast groups and you might need firmware support in
> the switches.

It feels like 'cap_ib_addressing', or whatever we call it, captures
this very well. The IPoIB RFC is very much concerned with GIDs and
MGIDs and broadly requires the IBA addressing scheme.
cap_ib_addressing asserts that the port uses that scheme.

We wouldn't accept patches to IPoIB to add a new addressing scheme
without seeing proper diligence on the standards work.

Looking away from the standards, using cap_XX seems very sane: we are
building a well-defined system of invariants. You can't call into the
SA functions if cap_sa is not set, you can't call into the mcast
functions if cap_mcast is not set, and you can't form an AH from IB
GIDs/MGID/LID without cap_ib_addressing.

It makes so much sense for the ULP to directly require the needed caps
for the kernel APIs it intends to call, or not use the RDMA port at
all.
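
As a rough illustration of the invariant described above, here is a
minimal userspace sketch in C. It is not kernel code: the struct
layouts, the ulp_can_use_port() helper and the ipoib_like_requirements
table are all hypothetical names made up for this example, and only the
cap_sa / cap_mcast / cap_ib_addressing names come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port capability flags (names taken from the thread,
 * layout invented for this sketch). */
struct port_caps {
	bool cap_sa;            /* SA functions may be called */
	bool cap_mcast;         /* mcast functions may be called */
	bool cap_ib_addressing; /* AHs may be formed from IB GID/MGID/LID */
};

/* Hypothetical description of what a ULP intends to call. */
struct ulp_requirements {
	bool needs_sa;
	bool needs_mcast;
	bool needs_ib_addressing;
};

/* Returns true only if every capability the ULP requires is set on the
 * port. A ULP that gets false should not use the port at all. */
static bool ulp_can_use_port(const struct ulp_requirements *req,
			     const struct port_caps *caps)
{
	assert(req != NULL && caps != NULL);

	if (req->needs_sa && !caps->cap_sa)
		return false;
	if (req->needs_mcast && !caps->cap_mcast)
		return false;
	if (req->needs_ib_addressing && !caps->cap_ib_addressing)
		return false;
	return true;
}

/* Assumption for illustration: an IPoIB-like ULP needs all three. */
static const struct ulp_requirements ipoib_like_requirements = {
	.needs_sa = true,
	.needs_mcast = true,
	.needs_ib_addressing = true,
};
```

The point of the sketch is that the ULP states its requirements once,
in terms of the APIs it will call, and the check is a plain conjunction
of capability bits rather than a test on the transport type.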
> > We can see how this might work in future, lets say OPAv2 *requires* the
> > 32 bit LID, for that case cap_ib_address = 0 cap_opa_address = 1. If
> > we don't update IPoIB and it uses the tests from above then it
> > immediately, and correctly, stops running on those OPAv2 devices.
> > 
> > Once patched to support cap_opa_address then it will begin working
> > again. That seems very sane..
> 
> It is very sane from an implementation standpoint, but from the larger
> interoperability standpoint, you need that spec to be extended to the
> new fabric simultaneously.

I liked the OPAv2 hypothetical because it doesn't actually touch the
IPoIB spec. The IPoIB spec has little to say about LIDs or LRHs; it
works entirely at the GID/MGID/GRH level.
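
The OPAv2 hypothetical can be sketched the same way. This is again a
userspace illustration under assumptions: the CAP_IB_ADDRESS and
CAP_OPA_ADDRESS bit values, struct ulp and ulp_runs_on() are invented
for the example and only mirror the cap_ib_address / cap_opa_address
names used in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical addressing capability bits for a port. */
enum {
	CAP_IB_ADDRESS  = 1 << 0, /* classic IB addressing */
	CAP_OPA_ADDRESS = 1 << 1, /* hypothetical OPAv2 32-bit LID addressing */
};

/* Hypothetical ULP descriptor: the addressing schemes it knows about. */
struct ulp {
	uint32_t supported_addressing;
};

/* The ULP runs on a port only if the port offers at least one
 * addressing scheme the ULP has been written to handle. */
static bool ulp_runs_on(const struct ulp *u, uint32_t port_caps)
{
	assert(u != NULL);
	return (u->supported_addressing & port_caps) != 0;
}
```

With these definitions, a ULP that only lists CAP_IB_ADDRESS stops
running on a port that sets only CAP_OPA_ADDRESS, and starts working
again once CAP_OPA_ADDRESS is added to its supported set, without any
change to the check itself.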
Jason