Date:	Tue, 14 Jul 2015 14:42:32 +0200
From:	Ursula Braun <ubraun@...ux.vnet.ibm.com>
To:	davem@...emloft.net
Cc:	utz.bacher@...ibm.com, netdev@...r.kernel.org,
	linux-s390@...r.kernel.org, schwidefsky@...ibm.com,
	heiko.carstens@...ibm.com, ursula.braun@...ibm.com,
	ubraun@...ux.vnet.ibm.com
Subject: [PATCH V2 net-next 0/3] net: implement SMC-R solution

From: Ursula Braun <ursula.braun@...ibm.com>

Eric,

this is V2 of my SMC-R patches, containing in particular a new version of
the required TCP changes. As you suggested, the SMC-specific hooks in the
TCP code are built only for CONFIG_AFSMC. I have also added helpers in
include files to avoid spreading #ifdefs across the net C files.

V2 changes:
1. activate tcp changes for CONFIG_AFSMC only (as suggested by Eric Dumazet)
2. add additional hook in net/core/sock.c
3. fix bitfield endianness problem

Thanks,
        Ursula

In 2013, IBM introduced an optimized communications solution for the
IBM zEnterprise EC12 and BC12 (s390 in Linux terminology) that combines
the IBM 10GbE RoCE Express feature with the Shared Memory
Communications-RDMA (SMC-R) protocol [1].
SMC-R is designed for the enterprise data center environment and is an open
protocol as specified in the informational RFC [2]. The final draft
submitted by IBM has been approved for publication and is in the final
editorial stage. Another implementation of this protocol has been available
since 2013 with IBM z/OS Version 2 Release 1.

SMC-R provides a “sockets over RDMA” solution that leverages industry
standard RDMA over Converged Ethernet (RoCE) technology.

IBM has developed a Linux implementation of the SMC-R standard. A new
socket protocol family AF_SMC is introduced. A preload library can be used
to enable TCP-based applications to use SMC-R without changes. 
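
For an application that does not use the preload library, opening an SMC
socket explicitly could look roughly like the sketch below. It is only an
illustration: the helper name open_stream_socket() is hypothetical, the
protocol argument 0 is assumed, and the fallback value 43 for AF_SMC is the
number used by later mainline kernels, which may differ from what this
series defines.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Assumption: 43 is the AF_SMC value of later mainline kernels;
 * the value defined by this patch series may differ. */
#ifndef AF_SMC
#define AF_SMC 43
#endif

/* Hypothetical helper: try an AF_SMC stream socket first and fall back
 * to a plain TCP socket if the kernel does not support the family.
 * On success the chosen family is stored in *family_used. */
int open_stream_socket(int *family_used)
{
    int fd = socket(AF_SMC, SOCK_STREAM, 0);

    if (fd >= 0) {
        *family_used = AF_SMC;
        return fd;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd >= 0)
        *family_used = AF_INET;
    return fd;  /* negative if neither family could be opened */
}
```

The returned descriptor is then used with the usual connect(), send() and
recv() calls, which is what lets a preload library substitute the family
without the application noticing.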

Key aspects of SMC-R are: 
1. Provides optimized performance compared to standard TCP/IP over Ethernet
   within the data center for both request/response (latency) and streaming
   workloads (CPU savings) [3]. 
   Initial benchmarks on Linux on x86 processors have shown latency
   reduction of up to 52% with a throughput gain of 111% using SMC-R vs TCP
   for request/response message patterns (10 concurrent TCP connections
   with 16KB messages) and CPU savings of up to 69% for streaming data
   patterns (single TCP connection with 20MB of data in one direction).
   [1] is currently being updated to contain more detailed information on
   Linux and performance.
2. In order to preserve the traditional network administrative model the
   SMC-R protocol ties into the existing IP addresses and uses TCP's
   handshake to establish connections. This allows existing management
   tools and security infrastructure to control the creation of SMC
   connections.
3. The SMC-R protocol logically bonds multiple RoCE adapters together,
   providing redundancy with transparent fail-over for improved high
   availability, increased bandwidth and load balancing across multiple
   RDMA-capable devices.
4. Due to its handshake protocol, SMC-R is compatible with (transparent to)
   existing TCP connection load balancers that are commonly used in the
   enterprise data center environment for multi-tier application workloads.
5. SMC-R's handshake protocol allows for transparent fallback to TCP/IP,
   should one of the peers not be capable of the protocol.
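
The fallback rule in items 4 and 5 amounts to a small decision function.
The sketch below is not taken from the patches; the enum and the function
name choose_transport() are hypothetical and only restate the rule that
SMC-R is used when both peers are capable and the RDMA credential exchange
succeeds, and TCP/IP is used otherwise.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum transport {
    TRANSPORT_TCP,   /* connection stays on standard TCP/IP */
    TRANSPORT_SMCR   /* connection continues over SMC-R */
};

/* local_smcr / peer_smcr: capability indicated during the TCP handshake.
 * rdma_exchange_ok: whether the subsequent SMC-R exchange of RDMA
 * credentials completed successfully. */
enum transport choose_transport(bool local_smcr, bool peer_smcr,
                                bool rdma_exchange_ok)
{
    if (!local_smcr || !peer_smcr)
        return TRANSPORT_TCP;   /* one peer is not capable: plain TCP */

    return rdma_exchange_ok ? TRANSPORT_SMCR : TRANSPORT_TCP;
}
```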

Additional SMC-R overview and reference materials are available [1].  

The SMC-R “rendezvous” protocol eliminates the need for RDMA-CM; the
exchange occurs through an initial TCP connection instead. Building on a
TCP connection to establish an SMC-R connection solves many key
requirements, including #4 and #5 above.
The rendezvous process occurs in 2 phases: 
1. TCP/IP 3-way exchange:
   Initiated when both client and server indicate SMC-R capability by
   including TCP experimental options in the TCP/IP 3-way handshake (SYN
   flows) as described in RFC 6994 [4]. The ExID assigned by IANA is
   0xE2D4C3D9 [5].
2. SMC-R 3-way exchange:
   When both partners indicate SMC-R capability, then at the completion of
   the 3-way TCP handshake the SMC-R layers in each peer take control of
   the TCP connection and exchange their RDMA credentials. If this 3-way
   exchange completes successfully, the connection continues using SMC-R.
   If the exchange is not successful, the connection falls back to standard
   TCP/IP.
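
To make phase 1 concrete, here is a small user-space sketch of how the
capability indication could be encoded and recognized in a TCP option
block. It is not the code from patch 1. It assumes the RFC 6994 layout of
kind, length and a 32-bit ExID in network byte order, and it assumes
option kind 254 (RFC 6994 permits 253 or 254). The function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_EXP    254          /* assumption: RFC 6994 allows 253 or 254 */
#define SMCR_EXID     0xE2D4C3D9u  /* ExID quoted in the text above */
#define SMCR_OPT_LEN  6            /* kind + length + 32-bit ExID */

/* Write the option into buf; returns bytes written, or 0 if no room. */
size_t smcr_opt_write(uint8_t *buf, size_t room)
{
    if (room < SMCR_OPT_LEN)
        return 0;
    buf[0] = TCPOPT_EXP;
    buf[1] = SMCR_OPT_LEN;
    buf[2] = (uint8_t)(SMCR_EXID >> 24);   /* network byte order */
    buf[3] = (uint8_t)(SMCR_EXID >> 16);
    buf[4] = (uint8_t)(SMCR_EXID >> 8);
    buf[5] = (uint8_t)SMCR_EXID;
    return SMCR_OPT_LEN;
}

/* Walk a TCP option block; returns 1 if the SMC-R ExID is present. */
int smcr_opt_present(const uint8_t *opts, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t kind = opts[i];
        uint8_t olen;

        if (kind == 0)              /* End of Option List */
            break;
        if (kind == 1) {            /* NOP is a single byte */
            i++;
            continue;
        }
        if (len - i < 2)            /* truncated option header */
            break;
        olen = opts[i + 1];
        if (olen < 2 || olen > len - i)
            break;                  /* malformed length */
        if (kind == TCPOPT_EXP && olen >= SMCR_OPT_LEN) {
            uint32_t exid = ((uint32_t)opts[i + 2] << 24) |
                            ((uint32_t)opts[i + 3] << 16) |
                            ((uint32_t)opts[i + 4] << 8) |
                             (uint32_t)opts[i + 5];
            if (exid == SMCR_EXID)
                return 1;
        }
        i += olen;
    }
    return 0;
}
```

A peer that finds the option in the SYN and in the SYN-ACK would proceed
to phase 2; otherwise the connection simply stays on TCP/IP.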

References:
[1] SMC-R Overview and Reference Materials:
    http://www-01.ibm.com/software/network/commserver/SMCR/ 
[2] SMC-R Informational RFC:
    http://tools.ietf.org/html/draft-fox-tcpm-shared-memory-rdma-07
[3] Linux SMC-R Overview and Performance Summary
    (archs x86 and s390):
    http://www-01.ibm.com/software/network/commserver/SMCR/ 
[4] Shared Use of TCP Experimental Options RFC 6994:
    https://tools.ietf.org/rfc/rfc6994.txt    
[5] IANA ExID SMCR: 
    http://www.iana.org/assignments/tcp-parameters/tcp-parameters.xhtml#tcp-exids

The patch series is prepared to apply to net-next and consists of these
parts:
1. net/ipv4/tcp: TCP experimental option
2. net: definitions to establish new socket family
3. net/smc: new socket family

In the future, SMC-R will be enhanced to cover:
- IPv6 support
- Tracing
- Statistics support

Ursula Braun (3):
  tcp: introduce TCP experimental option for SMC
  net: introduce socket family constants
  smc: introduce socket family AF_SMC

 include/linux/socket.h     |    4 +-
 include/linux/tcp.h        |   16 +-
 include/net/request_sock.h |    3 +-
 include/net/smc.h          |   13 +
 include/net/tcp.h          |  145 ++
 net/Kconfig                |    1 +
 net/Makefile               |    1 +
 net/core/sock.c            |   15 +-
 net/ipv4/tcp_input.c       |    8 +
 net/ipv4/tcp_minisocks.c   |    3 +
 net/ipv4/tcp_output.c      |   23 +-
 net/smc/Kconfig            |    9 +
 net/smc/Makefile           |    3 +
 net/smc/af_smc.c           | 3142 ++++++++++++++++++++++++++++++++++++++++++
 net/smc/af_smc.h           |  706 ++++++++++
 net/smc/smc_core.c         | 3291 ++++++++++++++++++++++++++++++++++++++++++++
 net/smc/smc_llc.c          | 1597 +++++++++++++++++++++
 net/smc/smc_llc.h          |  192 +++
 net/smc/smc_proc.c         |  884 ++++++++++++
 19 files changed, 10034 insertions(+), 22 deletions(-)
 create mode 100644 include/net/smc.h
 create mode 100644 net/smc/Kconfig
 create mode 100644 net/smc/Makefile
 create mode 100644 net/smc/af_smc.c
 create mode 100644 net/smc/af_smc.h
 create mode 100644 net/smc/smc_core.c
 create mode 100644 net/smc/smc_llc.c
 create mode 100644 net/smc/smc_llc.h
 create mode 100644 net/smc/smc_proc.c

-- 
2.3.8
