Message-ID: <f5bc9020-8a7f-7280-fb65-2efb6bde032e@linux.alibaba.com>
Date: Thu, 16 Nov 2023 19:02:56 +0800
From: Wen Gu <guwen@...ux.alibaba.com>
To: "D. Wythe" <alibuda@...ux.alibaba.com>, kgraul@...ux.ibm.com,
 wenjia@...ux.ibm.com, jaka@...ux.ibm.com, wintera@...ux.ibm.com
Cc: kuba@...nel.org, davem@...emloft.net, netdev@...r.kernel.org,
 linux-s390@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: Re: [RFC PATCH net-next] net/smc: Introduce IPPROTO_SMC for smc


On 2023/11/8 19:25, D. Wythe wrote:
> From: "D. Wythe" <alibuda@...ux.alibaba.com>
> 
> This patch attempts to initiate a discussion on creating smc socket
> via AF_INET, similar to the following code snippet:
> 
> /* create v4 smc sock */
> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
> 
> /* create v6 smc sock */
> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
> 
> As we all know, this is how an SMC socket is currently created:
> 
> /* create v4 smc sock */
> v4 = socket(AF_SMC, SOCK_STREAM, SMCPROTO_SMC);
> 
> /* create v6 smc sock */
> v6 = socket(AF_SMC, SOCK_STREAM, SMCPROTO_SMC6);
> 
> Note: This is not to suggest removing the SMC path, but rather to propose
> adding a new path (inet path).
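
With both paths coexisting, an application could probe them in order. The following is a minimal sketch under stated assumptions, not something the thread settles: the ordering (inet path, then AF_SMC, then plain TCP) is only one possible choice, `RFC_IPPROTO_SMC`, `struct sock_args`, `smc_candidates()` and `open_stream()` are hypothetical names, and 263 is simply the value in the RFC diff, which may differ from whatever is eventually merged.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Assumption: value taken from the RFC diff, not an established constant. */
#define RFC_IPPROTO_SMC 263

/* Fallback definitions for C libraries that do not expose the SMC constants. */
#ifndef AF_SMC
#define AF_SMC 43
#endif
#ifndef SMCPROTO_SMC
#define SMCPROTO_SMC  0	/* SMC over IPv4 */
#define SMCPROTO_SMC6 1	/* SMC over IPv6 */
#endif

/* Hypothetical helper type: one set of socket() arguments. */
struct sock_args {
	int domain;
	int type;
	int protocol;
};

/*
 * Fill out[] with the argument sets to try, in order, for an IPv4 or
 * IPv6 stream socket. Returns the number of entries written, or 0 if
 * the family is neither AF_INET nor AF_INET6.
 */
size_t smc_candidates(int family, struct sock_args out[3])
{
	if (family != AF_INET && family != AF_INET6)
		return 0;

	/* 1. proposed inet path */
	out[0] = (struct sock_args){ family, SOCK_STREAM, RFC_IPPROTO_SMC };
	/* 2. existing smc path */
	out[1] = (struct sock_args){ AF_SMC, SOCK_STREAM,
		family == AF_INET6 ? SMCPROTO_SMC6 : SMCPROTO_SMC };
	/* 3. plain TCP */
	out[2] = (struct sock_args){ family, SOCK_STREAM, IPPROTO_TCP };
	return 3;
}

/* Return the first socket that can be created, or -1 if none can. */
int open_stream(int family)
{
	struct sock_args c[3];
	size_t n = smc_candidates(family, c);

	for (size_t i = 0; i < n; i++) {
		int fd = socket(c[i].domain, c[i].type, c[i].protocol);

		if (fd >= 0)
			return fd;
	}
	return -1;
}
```

A caller would use `open_stream(AF_INET)` or `open_stream(AF_INET6)` in place of a direct socket() call; on a kernel without either SMC path it ends up with an ordinary TCP socket.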
> 
> There are several reasons why we believe it is much better than AF_SMC:
> 
> Semantics:
> 
> SMC extends the TCP protocol and switches its data path to RDMA if an
> RDMA link is ready. Otherwise, SMC should always try its best to degrade
> to TCP. From this perspective, SMC is a protocol derived from TCP that can
> also fall back to TCP, so it should be considered part of the same
> protocol family as TCP (AF_INET and AF_INET6).
> 
> Compatibility & Scalability:
> 
> Because of fallback, we need to handle it very carefully to stay
> consistent with TCP sockets. SMC has done a lot of work to ensure that,
> but there are still quite a few issues left, such as:
> 
> 1. The "ss" command cannot display the process name and ID associated with
> the fallback socket.
> 
> 2. The linger option is ineffective when a user tries to close the
> fallback socket.
> 
> 3. Some eBPF attach points related to INET_SOCK are ineffective for
> fallback sockets, such as BPF_CGROUP_INET_SOCK_RELEASE.
> 
> 4. SO_PEEK_OFF is an unsupported socket option for fallback sockets,
> while it is of course supported for TCP sockets.
> 
> Of course, we can fix each issue one by one, but it is not a fundamental
> solution. Any changes on the inet path may require re-synchronization,
> including bug fixes, security fixes, tracing, new features and more. For
> example, there is a commit which we think is very valuable:
> 
> commit 0dd061a6a115 ("bpf: Add update_socket_protocol hook")
> 
> This commit allows users to dynamically modify the protocol through eBPF
> programs before the socket is created, which is a more flexible approach
> than smc_run (LD_PRELOAD). It does not require restarting the process
> and allows replacement to be controlled at the connection level, whereas
> smc_run operates at the process level.
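
The per-connection replacement described here comes down to a small decision function. The sketch below models that decision in plain user-space C so it can be read and tested on its own; in a real setup the same logic would presumably live in an eBPF program attached to the update_socket_protocol hook. `pick_protocol()` and `RFC_IPPROTO_SMC` are hypothetical names, and 263 is the value from the RFC diff, not an established constant.

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Assumption: value taken from the RFC diff, not an established constant. */
#define RFC_IPPROTO_SMC 263

/*
 * Decide which protocol a new socket should use. TCP stream sockets
 * over IPv4 or IPv6 (requested explicitly or via protocol 0) are
 * redirected to SMC; every other request is returned unchanged.
 */
int pick_protocol(int family, int type, int protocol)
{
	if ((family == AF_INET || family == AF_INET6) &&
	    type == SOCK_STREAM &&
	    (protocol == 0 || protocol == IPPROTO_TCP))
		return RFC_IPPROTO_SMC;

	return protocol;
}
```

Because the decision is taken per socket() call, a policy like this could be narrowed further (for example by cgroup or by process) without restarting anything, which is the contrast with smc_run drawn above.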
> 
> However, benefiting from it under the SMC path requires additional code,
> while nothing needs to change under the inet path.
> 
> I'm not saying that these issues cannot be fixed under the smc path.
> However, the solution for these issues often involves duplicating work
> that has already been done on the inet path. That is to say, if we can be
> under the inet path, we can easily reuse the existing infrastructure.
> 
> Performance:
> 
> In order to ensure consistency between fallback sockets and TCP sockets,
> SMC creates an additional TCP socket. This introduces additional overhead
> of approximately 15%-20% for the establishment and destruction of fallback
> sockets. In fact, for the users we have contacted who have shown interest
> in SMC, consistent performance between fallback and TCP has always been
> their top priority. No one can guarantee the availability of RDMA links,
> support for SMC on both sides, or that the user's environment is 100%
> suitable for SMC, so fallback is the only way to address those issues.
> But the additional performance overhead is unacceptable, as fallback
> cannot provide the benefits of RDMA and right now only adds a burden.
> 
> Under the inet path, we can embed the TCP sock into the SMC sock, so that
> when fallback occurs the socket behaves exactly like a TCP socket. In our
> POC, the performance of a fallback socket under the inet path is almost
> indistinguishable from that of a TCP socket, with less than 1% loss.
> Additionally, and more importantly, it has full feature compatibility
> with TCP sockets.
> 

> Of course, this is also possible under the smc path, but it would require
> a significant amount of work to ensure compatibility with TCP sockets,
> most of which has already been done in the inet path. And still, any
> changes in the inet path may require re-synchronization.
> 
> I also noticed that there have been some discussions on this issue before.
> 
> Link: https://lore.kernel.org/stable/4a873ea1-ba83-1506-9172-e955d5f9ae16@redhat.com/
> 
> And I saw some supportive opinions here, maybe it is time to continue
> discussing this matter now.
> 

I see the reasons.

Since introducing IPPROTO_SMC could mean a lot of work on the current SMC
code, could you give us a rough idea of what you are going to do in the
implementation?

And if AF_INET+IPPROTO_SMC coexists with the current AF_SMC, which one
should be chosen in which situations?

Thanks,
Wen Gu


> Signed-off-by: D. Wythe <alibuda@...ux.alibaba.com>
> ---
>   include/uapi/linux/in.h | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
> index e682ab6..0c6322b 100644
> --- a/include/uapi/linux/in.h
> +++ b/include/uapi/linux/in.h
> @@ -83,6 +83,8 @@ enum {
>   #define IPPROTO_RAW		IPPROTO_RAW
>     IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
>   #define IPPROTO_MPTCP		IPPROTO_MPTCP
> +  IPPROTO_SMC = 263,		/* Shared Memory Communications		*/
> +#define IPPROTO_SMC		IPPROTO_SMC
>     IPPROTO_MAX
>   };
>   #endif
