Message-ID: <DB4PR05MB0863E9CCCD1BEA34F244EE2CB1EF0@DB4PR05MB0863.eurprd05.prod.outlook.com>
Date:	Tue, 21 Apr 2015 23:36:40 +0000
From:	Liran Liss <liranl@...lanox.com>
To:	Michael Wang <yun.wang@...fitbricks.com>,
	Roland Dreier <roland@...nel.org>,
	Sean Hefty <sean.hefty@...el.com>,
	Hal Rosenstock <hal.rosenstock@...il.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"hal@....mellanox.co.il" <hal@....mellanox.co.il>
CC:	Tom Tucker <tom@...ngridcomputing.com>,
	Steve Wise <swise@...ngridcomputing.com>,
	Hoang-Nam Nguyen <hnguyen@...ibm.com>,
	"raisch@...ibm.com" <raisch@...ibm.com>,
	Mike Marciniszyn <infinipath@...el.com>,
	Eli Cohen <eli@...lanox.com>,
	Faisal Latif <faisal.latif@...el.com>,
	Jack Morgenstein <jackm@....mellanox.co.il>,
	"Or Gerlitz" <ogerlitz@...lanox.com>,
	Haggai Eran <haggaie@...lanox.com>,
	"Ira Weiny" <ira.weiny@...el.com>, Tom Talpey <tom@...pey.com>,
	Jason Gunthorpe <jgunthorpe@...idianresearch.com>,
	Doug Ledford <dledford@...hat.com>
Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

Hi Michael,

The spirit of this patch-set is great, but I think that we need to clarify some concepts.
Since this will affect the whole patch-set, I am laying out my concerns here rather than in replies to individual patches.

A suggestion for the resulting management helpers is given below.
I believe the result would be much more coherent.
--Liran

In general
==========

An ib_dev (or one of its ports) should be distinguished by 3 qualifiers:
- The link layer:
-- Ethernet (shared by iWARP, USNIC, and RoCE)
-- InfiniBand

- The transport (*)
-- IBTA transport (shared by IB and RoCE)
-- iWARP transport
-- USNIC transport

(*) Transport means both:
- The L4 wire protocols (e.g., BTH+ headers of IBTA, optionally encapsulated by UDP in RoCEv2, or the iWARP stack)
- The transport semantics (for example, there are slight semantic differences between IBTA and iWARP)

- The node type (**)
-- CA
-- Switch
-- Router

(**) This has been extended to also encode the transport in the current code.
At least for user-space visible APIs, we might choose to leave this for backward compatibility, but we can consider cleaning up the kernel code.

So, I think that our "old-transport" below is just fine.
No need to change it (and you aren't, since it is currently implemented as a function).

The "new-transport" does not really exist; it breaks down into several capability checks of the L4 transport, optionally with conditions on the link type.
I would remove the table below and state what we really want to achieve:
==> move technology-specific feature-check logic out of the (multiple!) IB code components and various ULPs into per-feature helpers.
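
To make the three qualifiers concrete, here is a minimal stand-alone sketch. All type and function names in it (link_layer, transport, node_type, port_desc, port_is_*) are hypothetical and chosen for illustration; they are not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-alone model, not the kernel's ib_verbs.h. */
enum link_layer { LINK_INFINIBAND, LINK_ETHERNET };
enum transport  { TRANSPORT_IBTA, TRANSPORT_IWARP, TRANSPORT_USNIC };
enum node_type  { NODE_CA, NODE_SWITCH, NODE_ROUTER };

/* One device port, described by the three independent qualifiers. */
struct port_desc {
	enum link_layer link;
	enum transport  transport;
	enum node_type  node;
};

/* IB and RoCE share the IBTA transport; only the link layer differs. */
static bool port_is_ib_transport(const struct port_desc *p)
{
	return p->transport == TRANSPORT_IBTA;
}

/* RoCE (IBoE): IBTA transport carried over an Ethernet link. */
static bool port_is_iboe(const struct port_desc *p)
{
	return p->transport == TRANSPORT_IBTA && p->link == LINK_ETHERNET;
}
```

With this split, a check such as "IB transport on an Ethernet link" lives in one helper instead of being repeated in each core component and ULP.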


Detailed remarks
================

1) The cap_*_*() helpers should have been introduced directly in patch 02/27.
This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and increases the number of patches in the patch-set.
Do this and remove patches 16-24.

2) The name rdma_tech_* is lame.
rdma_transport_*(), adhering to the above (*) remark, is much better.
For example, both IB and RoCE *do* use the same transport.

3) The name cap_* as it is used above is not accurate.
You use it to describe technology characteristics rather than extendable capabilities.
I would suggest having a single convention for all helpers, such as rdma_has_*() and rdma_is_*().
For example: cap_ib_smi() ==> rdma_has_smi().

4) Remove all capabilities that do not introduce any distinction in the current code.
We can add them as needed later.
This means removing patches:
- [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all IB devices support ipoib
- [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB devices support AF_IB.

On the other hand:
- rdma_has_multicast() makes sense, since iWARP doesn’t support it.
- cap_ib_sa() might make sense to cut code even further in the CMA, since RoCE has a GSI but no SA.

5) Do not modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs.
It *is* the link layer!

6) Remove cap_read_multi_sge
It is not a device/port feature, but a transport capability.
Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in 'enum ib_device_cap_flags'.
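
A rough sketch of the flag alternative follows. The flag name, the struct, and both functions are hypothetical; no such bit exists in 'enum ib_device_cap_flags' in this series. It assumes that the distinction being encoded is that an iWARP RDMA Read is limited to a single SGE.

```c
#include <stdbool.h>

/* Hypothetical flag; stands in for a new bit in the device cap flags. */
enum dev_cap_flags {
	DEV_CAP_READ_MULTI_SGE = 1u << 0,  /* RDMA Read may use several SGEs */
};

/* Hypothetical container for the capability bitmask. */
struct dev_attr {
	unsigned int cap_flags;
};

static bool dev_can_read_multi_sge(const struct dev_attr *attr)
{
	return (attr->cap_flags & DEV_CAP_READ_MULTI_SGE) != 0;
}

/* ULP-side use: how many SGEs to put into a single RDMA Read. */
static unsigned int read_sge_limit(const struct dev_attr *attr,
				   unsigned int max_sge)
{
	return dev_can_read_multi_sge(attr) ? max_sge : 1;
}
```

The ULP then asks about the capability itself and does not need to know which transport lacks it.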
 
7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah().
Address handles that refer to Ethernet links always have Ethernet addressing.

In the CMA code, using rdma_tech_iboe() is just fine. This is how you define cap_eth_ah() anyway.
Currently, this patch just adds clutter.

8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
We do need a transport qualifier, as exemplified in comment 5) above, and for a complete, clean model.
This is after renaming the function to rdma_is_ib_transport()...


Putting it all together
=======================

We are left with the following helpers:
- rdma_is_ib_transport()
- rdma_is_iwarp_transport()
- rdma_is_usnic_transport()
- rdma_is_iboe()
- rdma_has_mad()
- rdma_has_smi()
- rdma_has_gsi() - complements smi; can be used by the mad code for clarity
- rdma_has_sa()
- rdma_has_cm()
- rdma_has_mcast()
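
As an illustration only, some of these helpers could be expressed on top of the transport and link-layer qualifiers roughly as below. The struct and enum names are hypothetical and the signatures are simplified (the real helpers would take a device and port number). The SMI and GSI conditions are my assumptions; the SA condition follows remark 4 (RoCE has a GSI but no SA).

```c
#include <stdbool.h>

/* Hypothetical stand-alone model, not include/rdma/ib_verbs.h. */
enum rdma_link { RDMA_LINK_IB, RDMA_LINK_ETH };
enum rdma_tp   { RDMA_TP_IBTA, RDMA_TP_IWARP, RDMA_TP_USNIC };

struct rdma_port {
	enum rdma_link link;
	enum rdma_tp   transport;
};

static bool rdma_is_ib_transport(const struct rdma_port *p)
{
	return p->transport == RDMA_TP_IBTA;
}

static bool rdma_is_iboe(const struct rdma_port *p)
{
	return rdma_is_ib_transport(p) && p->link == RDMA_LINK_ETH;
}

/* Assumption: a GSI exists wherever the IBTA transport runs. */
static bool rdma_has_gsi(const struct rdma_port *p)
{
	return rdma_is_ib_transport(p);
}

/* Assumption: SMI needs the IBTA transport on an InfiniBand link. */
static bool rdma_has_smi(const struct rdma_port *p)
{
	return rdma_is_ib_transport(p) && p->link == RDMA_LINK_IB;
}

/* Per remark 4: RoCE has a GSI but no SA. */
static bool rdma_has_sa(const struct rdma_port *p)
{
	return rdma_has_gsi(p) && !rdma_is_iboe(p);
}

/* MAD support means at least one of the two management interfaces. */
static bool rdma_has_mad(const struct rdma_port *p)
{
	return rdma_has_smi(p) || rdma_has_gsi(p);
}
```

The remaining helpers (rdma_has_cm(), rdma_has_mcast(), and the iWARP/USNIC transport checks) would follow the same pattern.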


> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> 
> 
> Since v4:
>   * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
>     Roland, Ira and Steve :-) Please remind me if anything missed :-P
>   * Fix logical issue inside 3#, 14#
>   * Refine 3#, 4#, 5# with label 'free'
>   * Rework 10# to stop using port 1 when port already assigned
> 
> There are plenty of lengthy code to check the transport type of IB device, or
> the link layer type of it's port, but actually we are just speculating whether a
> particular management/feature is supported by the device/port.
> 
> Thus instead of inferring, we should have our own mechanism for IB
> management capability/protocol/feature checking, several proposals below.
> 
> This patch set will reform the method of getting transport type, we will now
> using query_transport() instead of inferring from transport and link layer
> respectively, also we defined the new transport type to make the concept
> more reasonable.
> 
> Mapping List:
> 		node-type	link-layer	old-transport	new-transport
> nes		RNIC		ETH		IWARP		IWARP
> amso1100	RNIC		ETH		IWARP		IWARP
> cxgb3   	RNIC		ETH		IWARP		IWARP
> cxgb4   	RNIC		ETH		IWARP		IWARP
> usnic   	USNIC_UDP	ETH		USNIC_UDP	USNIC_UDP
> ocrdma  	IB_CA		ETH		IB		IBOE
> mlx4    	IB_CA		IB/ETH		IB		IB/IBOE
> mlx5    	IB_CA		IB		IB		IB
> ehca    	IB_CA		IB		IB		IB
> ipath   	IB_CA		IB		IB		IB
> mthca   	IB_CA		IB		IB		IB
> qib     	IB_CA		IB		IB		IB
> 
> For example:
> 	if (transport == IB) && (link-layer == ETH) will now become:
> 	if (query_transport() == IBOE)
> 
> Thus we will be able to get rid of the respective transport and link-layer
> checking, and it will help us to add new protocol/Technology (like OPA) more
> easier, also with the introduced management helpers, IB management logical
> will be more clear and easier for extending.
> 
> Highlights:
>     The patch set covered a wide range of IB stuff, thus for those who are
>     familiar with the particular part, your suggestion would be invaluable ;-)
> 
>     Patch 1#~15# included all the logical reform, 16#~25# introduced the
>     management helpers, 26#~27# do clean up.
> 
>     Patches haven't been tested yet, we appreciate if any one who have these
>     HW willing to provide his Tested-by :-)
> 
>     Doug suggested the bitmask mechanism:
> 	https://www.mail-archive.com/linux-
> rdma@...r.kernel.org/msg23765.html
>     which could be the plan for future reforming, we prefer that to be another
>     series which focus on semantic and performance.
> 
>     This patch-set is somewhat 'bloated' now and it may be a good timing for
>     staging, I'd like to suggest we focus on improving existed helpers and push
>     all the further reforms into next series ;-)
> 
> Proposals:
>     Sean:
> 	https://www.mail-archive.com/linux-
> rdma@...r.kernel.org/msg23339.html
>     Doug:
> 	https://www.mail-archive.com/linux-
> rdma@...r.kernel.org/msg23418.html
> 	https://www.mail-archive.com/linux-
> rdma@...r.kernel.org/msg23765.html
>     Jason:
> 	https://www.mail-archive.com/linux-
> rdma@...r.kernel.org/msg23425.html
> 
> Michael Wang (27):
>     IB/Verbs: Implement new callback query_transport()
>     IB/Verbs: Implement raw management helpers
>     IB/Verbs: Reform IB-core mad/agent/user_mad
>     IB/Verbs: Reform IB-core cm
>     IB/Verbs: Reform IB-core sa_query
>     IB/Verbs: Reform IB-core multicast
>     IB/Verbs: Reform IB-ulp ipoib
>     IB/Verbs: Reform IB-ulp xprtrdma
>     IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>     IB/Verbs: Reform cm related part in IB-core cma/ucm
>     IB/Verbs: Reform route related part in IB-core cma
>     IB/Verbs: Reform mcast related part in IB-core cma
>     IB/Verbs: Reserve legacy transport type in 'dev_addr'
>     IB/Verbs: Reform cma_acquire_dev()
>     IB/Verbs: Reform rest part in IB-core cma
>     IB/Verbs: Use management helper cap_ib_mad()
>     IB/Verbs: Use management helper cap_ib_smi()
>     IB/Verbs: Use management helper cap_ib_cm()
>     IB/Verbs: Use management helper cap_iw_cm()
>     IB/Verbs: Use management helper cap_ib_sa()
>     IB/Verbs: Use management helper cap_ib_mcast()
>     IB/Verbs: Use management helper cap_ipoib()
>     IB/Verbs: Use management helper cap_read_multi_sge()
>     IB/Verbs: Use management helper cap_af_ib()
>     IB/Verbs: Use management helper cap_eth_ah()
>     IB/Verbs: Clean up rdma_ib_or_iboe()
>     IB/Verbs: Cleanup rdma_node_get_transport()
> 
> ---
>  drivers/infiniband/core/agent.c              |    4
>  drivers/infiniband/core/cm.c                 |   26 +-
>  drivers/infiniband/core/cma.c                |  328 ++++++++++++---------------
>  drivers/infiniband/core/device.c             |    1
>  drivers/infiniband/core/mad.c                |   51 ++--
>  drivers/infiniband/core/multicast.c          |   18 -
>  drivers/infiniband/core/sa_query.c           |   41 +--
>  drivers/infiniband/core/sysfs.c              |    8
>  drivers/infiniband/core/ucm.c                |    5
>  drivers/infiniband/core/ucma.c               |   27 --
>  drivers/infiniband/core/user_mad.c           |   32 +-
>  drivers/infiniband/core/uverbs_cmd.c         |    6
>  drivers/infiniband/core/verbs.c              |   33 --
>  drivers/infiniband/hw/amso1100/c2_provider.c |    7
>  drivers/infiniband/hw/cxgb3/iwch_provider.c  |    7
>  drivers/infiniband/hw/cxgb4/provider.c       |    7
>  drivers/infiniband/hw/ehca/ehca_hca.c        |    6
>  drivers/infiniband/hw/ehca/ehca_iverbs.h     |    3
>  drivers/infiniband/hw/ehca/ehca_main.c       |    1
>  drivers/infiniband/hw/ipath/ipath_verbs.c    |    7
>  drivers/infiniband/hw/mlx4/main.c            |   10
>  drivers/infiniband/hw/mlx5/main.c            |    7
>  drivers/infiniband/hw/mthca/mthca_provider.c |    7
>  drivers/infiniband/hw/nes/nes_verbs.c        |    6
>  drivers/infiniband/hw/ocrdma/ocrdma_main.c   |    1
>  drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  |    6
>  drivers/infiniband/hw/ocrdma/ocrdma_verbs.h  |    3
>  drivers/infiniband/hw/qib/qib_verbs.c        |    7
>  drivers/infiniband/hw/usnic/usnic_ib_main.c  |    1
>  drivers/infiniband/hw/usnic/usnic_ib_verbs.c |    6
>  drivers/infiniband/hw/usnic/usnic_ib_verbs.h |    2
>  drivers/infiniband/ulp/ipoib/ipoib_main.c    |   17 -
>  include/rdma/ib_verbs.h                      |  204 +++++++++++++++-
>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c      |    6
>  net/sunrpc/xprtrdma/svc_rdma_transport.c     |   51 +---
>  35 files changed, 584 insertions(+), 368 deletions(-)
