Message-ID: <4a873ea1-ba83-1506-9172-e955d5f9ae16@redhat.com>
Date: Wed, 17 May 2017 20:07:00 -0400
From: Doug Ledford <dledford@...hat.com>
To: Parav Pandit <parav@...lanox.com>,
David Miller <davem@...emloft.net>
Cc: "Bart.VanAssche@...disk.com" <Bart.VanAssche@...disk.com>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
"hch@....de" <hch@....de>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
"ubraun@...ux.vnet.ibm.com" <ubraun@...ux.vnet.ibm.com>
Subject: Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure
On 5/17/2017 6:37 PM, Parav Pandit wrote:
> Hi Doug,
>
>> It would have been better with AF_INET/AF_INET6 and an option to enable
>> SMC than AF_SMC. The first implementation simply assumes AF_INET in
>> the presence of AF_SMC. When IPv6 support is added, some sort of
>> guessing logic will have to be put in place to try and determine if an
>> AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
>> guaranteed way of telling. Apps can use struct sockaddr_storage as their
>> normal element to stick the address into, and could rely on the kernel to
>> interpret it properly based on the AF_INET/AF_INET6 differentiation, and
>> this breaks that. The RFC gives *some* thought to adding IPv6 in the
>> future, but not a lot. It may be that the answer is that in the future, IPv6
>> support is enabled by making the IPv6 API be
>> AF_INET6 + setsockopt(SMC) or the equivalent. If that's the case, then I
>> would suggest that the later API specifically call out that AF_INET +
>> setsockopt(SMC) is identical to AF_SMC.
>>
>
> What are the shortcomings of my proposal [1], which I reiterate below?
> Bart also suggested defining a new stream protocol for SMC, similar to SCTP.
>
> (a) address family should be either AF_INET or AF_INET6
> (b) socket() API to introduce a new protocol, PROTO_SMC, passed as the protocol argument of the socket() system call.
>
> With this, no additional setsockopt() is needed.
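
An illustrative sketch of the (a)/(b) interface quoted above, not code from the thread. The name PROTO_SMC_PLACEHOLDER, its value 256, and the helper open_stream_socket() are hypothetical; the proposal names PROTO_SMC but assigns no number. The helper asks for the preferred stream protocol under an ordinary address family and falls back to TCP if the kernel rejects it.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in: the thread proposes PROTO_SMC without a value. */
#define PROTO_SMC_PLACEHOLDER 256

/*
 * Open a stream socket for the given address family (AF_INET or AF_INET6),
 * trying preferred_proto first. If the kernel refuses that protocol, retry
 * with plain TCP. Returns a socket descriptor, or -1 with errno set.
 */
int open_stream_socket(int family, int preferred_proto)
{
    int fd = socket(family, SOCK_STREAM, preferred_proto);

    if (fd < 0 && preferred_proto != IPPROTO_TCP)
        fd = socket(family, SOCK_STREAM, IPPROTO_TCP);

    return fd;
}
```

A caller would use open_stream_socket(AF_INET, PROTO_SMC_PLACEHOLDER) and then bind() or connect() with an unchanged struct sockaddr_in, which is the point of keeping the address family as AF_INET/AF_INET6.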
>
> With this, user space applications call getaddrinfo() with hints such as
> hints.ai_family = AF_INET;
> hints.ai_socktype = SOCK_STREAM;
>
> getaddrinfo() then returns both protocols, TCP and SMC.
> Well-known applications such as the Redis client iterate over all entries returned by getaddrinfo() when establishing connections to servers.
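
The iteration pattern described above can be sketched as follows. This is a generic getaddrinfo() loop, not code taken from Redis or from the thread, and connect_any() is a hypothetical helper name. Whether the resolver would ever list an SMC entry alongside TCP is the assumption of the proposal; the loop only relies on passing each entry's ai_protocol through to socket().

```c
#define _POSIX_C_SOURCE 200112L
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Resolve host/service and try every returned entry in order, creating the
 * socket with whatever protocol that entry names. Returns a connected
 * descriptor, or -1 if resolution fails or no entry can be connected.
 */
int connect_any(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    /* ai_protocol stays 0: accept any stream protocol the resolver lists. */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;               /* protocol not available, try next entry */
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                  /* connected */
        close(fd);
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}
```

An application written this way needs no source change to pick up an additional stream protocol, provided the resolver reports it.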
>
> There are a few advantages to this interface.
> 1. No change to any application makefile is needed, unless the user wants to explicitly force the SMC protocol.
> 2. No need to use LD_PRELOAD (which won't work anyway, because bind() and connect() of SMC check for AF_INET).
> 3. No major changes to glibc to process AF_SMC differently.
>    glibc references IPPROTO_TCP in 22 places (compared to AF_INET in 140+ places).
> 4. A much simpler glibc test matrix for the new protocol.
> 5. No need to recompile applications, as long as getaddrinfo() returns all streaming protocols (TCP, SMC).
> 6. For applications to make use of setsockopt(), they need another knob and a hint from elsewhere, which can be avoided because the SMC TCP option negotiates with the remote end.
>
> And representing a new protocol as a new protocol within a given address family appears correct, compared to setting socket options.
>
> Tools like CRIU might hit a race condition: when the tool queries the socket option, SMC is not yet set, but it is set later, and the tool then handles the socket incorrectly.
> Selecting SMC as the protocol in socket() makes this area easier to reason about as well.
I have no problem with the proposal in itself, but as IBM released this
publication and did their own implementation prior to submitting things
upstream, and as there might be implementations in the field, it
depends on whether we wish to call those in-the-field implementations
experimental and break them as we go to a final implementation of
version 1, or if we consider version 1 baked. I'm fine breaking it.
After all, that's what happened with XRC back in the day and Mellanox
learned a valuable lesson about upstream first. I have no problem with
IBM learning that same lesson IMO. So, I find your proposal, including
breaking the API of the version 1 implementation just taken into the
kernel before it has had time to fully sit and gel, acceptable.
But this is where we kind of need a judgment from on high, and why I
Cc:ed Linus on this thread. Any input on this issue, Linus?
> I have an additional proposal for the link group and resource creation area. I will take that up after this discussion.
Look forward to hearing it.
> [1] https://patchwork.kernel.org/patch/9719375/
--
Doug Ledford <dledford@...hat.com>
GPG Key ID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD