netdev - Re: [PATCH v2 net-next 0/3] rds: IPv6 support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180706151402.GI30983@oracle.com>
Date:   Fri, 6 Jul 2018 11:14:02 -0400
From:   Sowmini Varadhan <sowmini.varadhan@...cle.com>
To:     Ka-Cheong Poon <ka-cheong.poon@...cle.com>
Cc:     netdev@...r.kernel.org, santosh.shilimkar@...cle.com,
        davem@...emloft.net, rds-devel@....oracle.com
Subject: Re: [PATCH v2 net-next 0/3] rds: IPv6 support

On (07/06/18 22:36), Ka-Cheong Poon wrote:
> This patch series does not change existing behavior.  But
> I think this is a strange RDS semantics as it differs from
> other types of socket.  But this is not about IPv6 support
> and can be dealt with later.

sure, 

> >   Since we are choosing to treat this behavior as a feature we
> >   need to be consistent in rds_getname(). If we find
> >   (!peer and !ipv6_addr_any(rs_conn_addr)) and the socket is not yet bound,
> >   the returned address (address size, address family) should be based
> >   on the rs_conn_addr. (Otherwise if I connect to abc::1 without doing a
> >   bind(), and try to do a getsockname(), I get back a sockaddr_in?!)
> 
> 
> OK, will change that.
> >- rds_cancel_sent_to() and rds_connect and rds_bind and rds_sendmsg
> >   As DaveM has already pointed out, we should be using sa_family to figure
> >   out sockaddr_in vs sockaddr_in6 (not the other way around of inspecting
> >   len first, and then the family- that way wont work if I pass in
> >   sockaddr_storage).  E.g., see inet_dgram_connect.
> >
> >         if (addr_len < sizeof(uaddr->sa_family))
> >                 return -EINVAL;
> OK, will change that.

thank you

> >- In net/rds/rds.h;
> >
> >   /* The following ports, 16385, 18634, 18635, are registered with IANA as
> >    * the ports to be used for RDS over TCP and UDP.  18634 is the historical
> >
> >   What is "RDS over TCP and UDP"? There is no such thing as RDS-over-UDP.
> >   IN fact RDS has nothing to do with UDP. The comment is confused. See next
> >   item below, where the comment disappears.
> 
> 
> As mentioned in the above comments, they are registered with IANA.

16385 is registered for  RDS-TCP. 

what does it mean to run rds-tcp and rds_rdma with the CM at the same time? 
Is this possible?  (if it is, why not poach some other number like 2049
from NFS?!)

18635 is actually not used in the RDS code.
Why can you not use that for RDS_CM_PORT? 

In general, please do NOT pull up these definitions into net/rds/rds.h
They may change in the future- and the really belong in the transport
specific header file - you question this further below, see my answer
there.

> There is no RDS/UDP in Linux but the port numbers are registered
> nevertheless.  And RDS can work over UDP AFAICT.  BTW, it is really

No it cannot. RDS needs a reliable transport.  If you start using
UDP (or pigeon-carrier) for the transport you have to build the
reliability yourself.

The Oracle DB libraries either use RDS-TCP or RDS-IB or UDP (in the UDP
case they do their own congestion management in userspace, and 
history has shown that it is hard to get that right, thus the desire to
switch to rds-tcp).

Anyway, even though Andy Grover registered 18635 for "RDS-IB UDP",
that is probably the most harmless one to use for this case, because
- it is unused, 
- it calls itself rds-Infiniband,  
- the likelihood of doing rds-udp is infinitesmally small (it is more likely
  that we will need to send packets from abc::1 -> fe80:2 before we need
  rds-udp :-) :-) :-))
- and if rds-udp happens, we can use 16385/udp for rds-udp

so please use 18635 for your RDS_CM_PORT and move it to IB specific
header files and lets converge this thread.

Towing RDS_TCP_PORT around has absolutely nothing to do with your
ipv6 patch set.

> strange that RDS/TCP does not use the port number already registered.
> Anyway, the above comments are just a note on this fact in the IANA

16385 was in defacto use for a long time (since 2.6 kernels), thus I
registered it as well when I started fixing this up. 

the 18634/18635 are registered for rds-iB, (caps intended) not rds-tcp.
They are available for your use.

> database.  The following is a link to the IANA assignment.

yes, I am aware

> IMHO, there should only be RDS_PORT in the first place.  And it
> can be used for all transports.  For example, if RDS/SCTP is added,
> it can also use the same port number.  There is no need to define

if/when we add rds/sctp, we shall keep that in mind. 

This is getting to be "arguing for the sake of arguing". I dont have
the time for that.

> a new port number for each transport.  What is the rationale that
> each transport needs to have its own number and be defined in its
> own header file?

Some transports may not even need a port number. Some may need
several. Some may use sysctl (suggested by Tom Herbert) to make
this configurable. Until recently, we had rds/iWARP, that may need
its own transport hooks, it does not make sense to peanut-butter
that into the core module.

That is why it has to be in the transport. I hope that answers the
question.

> If the behavior of a software component is modified/enhanced such
> that the existing API of this component has problems dealing with
> this new behavior, should the API be modified or a new API be added
> to handle that?  If neither is done, shouldn't this be considered
> a bug?

whatever. The design (parallelization for perf) is fine. Some
parts of the API are work in progress based on priority/need. 
I dont want to bicker over this one, except to note that the judgemental
nature of the comment is interesting.

> >   Also, while you are there, s/exisiting/existing, please.
> 
> 
> OK, with change that.

Wonderful.

For the rest, I repeat: Oracle Clusters are using UDP/IPV6 today
(with no RDS). You need feature compat with UDP for that reason.

--Sowmini