lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 24 Jan 2017 10:51:25 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Willy Tarreau <w@....eu>
Cc:     Wei Wang <tracywwnj@...il.com>, netdev@...r.kernel.org,
        David Miller <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Yuchung Cheng <ycheng@...gle.com>, Wei Wang <weiwan@...gle.com>
Subject: Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

On Tue, 2017-01-24 at 19:34 +0100, Willy Tarreau wrote:
> Hi Eric,
> 
> On Tue, Jan 24, 2017 at 09:44:49AM -0800, Eric Dumazet wrote:
> > I believe there is a bug in this application.
> > 
> > It does not check connect() return value.
> 
> Yes in fact it does but I noticed the same thing, there's something causing
> the event not to be registered or something like this.
> 
> > When 0 is returned, it makes no sense to wait 200 ms :
> > 
> > > 06:29:24.048593 epoll_ctl(3, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLRDHUP, {u32=9, u64=9}}) = 0
> > > 06:29:24.048651 epoll_wait(3, [], 200, 0) = 0
> > > 06:29:24.048699 getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
> > 
> > And it makes no sense to call connect() again :
> > 
> > > 06:29:24.048751 connect(10, {sa_family=AF_INET, sin_port=htons(8001), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRES
> > > S (Operation now in progress)
> 
> I totally agree.
> 
> > man connect
> > 
> > <quote>
> > Generally,  connection-based protocol sockets may successfully connect()
> > only once;
> > </quote>
> > 
> > 
> > I would prefer we do not add yet another bit in tcp kernel sockets, to
> > work around some oddity in your program Willy.
> 
> I'm fine with chasing the bug on my side and fixing it, but there's a
> semantic trouble anyway with returning -EINPROGRESS :
>   - connect() = 0 indicates that the connection is established
>   - then a further connect() should return -EISCONN, and does so when
>     not using TFO
> 
> man connect says this regarding EINPROGRESS :
> 
>     The socket is nonblocking and the connection cannot be completed immediately.
>     It is possible to  select(2)  or  poll(2)  for completion  by  selecting  the
>     socket  for  writing.  After  select(2) indicates writability, use getsockopt(2)
>     to read the SO_ERROR option at level SOL_SOCKET to determine whether connect()
>     completed successfully (SO_ERROR is  zero)  or  unsuccess-fully (SO_ERROR is
>     one of the usual error codes listed here, explaining the reason for the failure).
> 
> Here we clearly have an incompatibility between this EINPROGRESS saying
> that we must poll, and poll returning POLLOUT suggesting that it's now
> OK.
> 
> I'm totally fine with not using an extra bit in a scarce area, but then
> we can either add an extra argument to __inet_stream_connect() to say
> "this is sendmsg" or just add an extra flag in the last argument.
> 
> But in general I don't feel comfortable with a semantics that doesn't
> completely match the current and documented one :-/
> 
> Thanks,
> Willy


We do not return -1 / EINPROGRESS but 0

Do not call connect() twice, it is clearly not supposed to work.

Fact that it happened to work is still kept for applications not using
new features (like TCP_FASTOPEN_CONNECT), we wont break this.

I would prefer you submit _if_ needed a patch on top of Wei patch, which
was carefully tested with our ~500 packetdrill tests.



TCP_FASTOPEN_CONNECT + connect() returning 0 is already a violation of
past behavior, in the sense that no connection really happened yet.

An application exploiting this return value and consider the server is
reachable would be mistaken.

We do not support connect() + TCP_FASTOPEN_CONNECT + read(), because we
do not want to add yet another conditional test in recvmsg() fast path
for such feature, while existing sendmsg() can already be used to send a
SYN with FastOpen option.

So people are not expected to blindly add TCP_FASTOPEN_CONNECT to every
TCP socket they allocate/use. This would add too much bloat to the
kernel.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ