Date:	Sun, 26 Sep 2010 20:35:15 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Willy Tarreau <w@....eu>
Cc:	netdev@...r.kernel.org
Subject: Re: TCP: orphans broken by RFC 2525 #2.17

On Sunday, 26 September 2010 at 19:40 +0200, Willy Tarreau wrote:
> Hi Eric,
> 
> On Sun, Sep 26, 2010 at 07:02:47PM +0200, Eric Dumazet wrote:
> > How could we delay the close() ? We must either send a FIN or RST.
> 
> I don't mean to delay the close(), but I'm aware that my description
> was not very clear.
> 
> Here's what I would find normal :
> 
> 1) upon close(), we send a FIN, whether there are incoming pending
>    data or not (after all, the difference is only a matter of timing,
>    as the data in the rx buffer might very well have come just
>    after the FIN, as it almost always does, BTW). The connection
>    then becomes FIN_WAIT1 just as now.
> 
> 2) mark the socket as orphaned
> 
> 3) when an ACK comes from the other side, either it's below our last
>    seq, and we simply ignore it, just as if we were in TIME_WAIT, or
>    it is equal to the last seq and indicates that it's now safe to
>    reset ; we would then just send the RST to notify the other side
>    that the data it sent were not read. The connection can then either
>    be destroyed or put in TIME_WAIT. It's the point where the connection
>    normally switches from FIN_WAIT1 to FIN_WAIT2, since the FIN has been
>    acked. The only difference is that we don't need a FIN_WAIT2 state
>    for an orphan.
> 
> > I would say, fix the program, so that RST is avoided ?
> 
> Not that easy, see below.
> 
> > The program does :
> > 
> > recv() // read the request
> > send() // queue the answer
> > close() // could work if world was perfect...
> > 
> > Change it to
> > 
> > recv()
> > send()
> > shutdown()
> > recv() // read and discard any excess data
> 
> New data arrives now, close() below will cause an RST again.
> 
> > close()
> > 
> > This will certainly send the FIN after all queued data is sent.
> > I am not sure the final recv() is even needed, it's Sunday after all ;)
> 
> Currently the real code (i.e. not the PoC I posted) does:
> 
>    recv()
>    send()
>    shutdown()
>    close()
> 
> The extra CRLF almost always happens between the recv() and send(). What
> I intend to do as a workaround is exactly what you described above, but
> I'm well aware it's not enough. It will only reduce the rate at which this
> case happens. Well, in fact, in 10 years of production at many sites, it's
> the first time such an issue is reported and it could be tracked down to
> these two extra bytes. But the workaround will not prevent the two extra
> bytes from coming after the last recv().
> 
> Also, the issue remains when processing large POST requests. Let's suppose
> the application is receiving a massive POST (eg: 10 MB) but the request is
> not authenticated, so the application returns an HTTP 401 response to
> require the client to authenticate. There's no way for the application to
> be notified that the small response was completely read by the client and
> that it's safe to close().
> 
> For these reasons, I concluded that the application can't get everything
> right and needs help from the kernel (put differently, I think that the
> RFC 2525 fix does harm as well as good). In my opinion, this section of
> the RFC was added based on a few observations of trivial cases, but its
> impact was not completely explored.
> 
> I'm willing to experiment, but I'm not very familiar with the code itself
> and sometimes I'm not sure about what I'm doing, so some help would
> probably be welcome. What I'd like to do is implement step 3 above,
> which is to send the RST on an orphan only upon receipt of the ACK that
> would switch a normal socket from FIN_WAIT1 to FIN_WAIT2.
> 
> Also, I'm not sure about what other OSes are doing. For instance, I tried
> on Solaris and did not observe the issue at all, though I think that
> Solaris simply does not implement the RFC2525 recommendation.
> 
> Have a nice Sunday evening ;-)
> Willy
> 

I was referring to this code. It works well for me.

shutdown(fd, SHUT_RDWR);
while (recv(fd, buff, sizeof(buff), 0) > 0)
	;
close(fd);
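
For anyone who wants to try the sequence above as a self-contained
function, here is a minimal sketch. The helper name drain_and_close()
and the 4096-byte buffer are assumptions made for illustration, not
taken from any real program. It only discards data that is already
queued when recv() runs, so it narrows the window Willy describes but
cannot help with bytes that arrive after the last recv().

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any library): shut the socket down,
 * discard whatever is readable, then close it.
 * Returns the result of close(): 0 on success, -1 on error.
 */
int drain_and_close(int fd)
{
	char buff[4096];	/* scratch buffer, size chosen arbitrarily */
	ssize_t n;

	/* Ask for a FIN once the queued output is sent. The result is
	 * ignored on purpose: the peer may already be gone (ENOTCONN). */
	(void)shutdown(fd, SHUT_RDWR);

	/* Read and throw away pending input so that close() does not
	 * find unread data in the receive queue. */
	for (;;) {
		n = recv(fd, buff, sizeof(buff), 0);
		if (n > 0)
			continue;		/* discarded some data, try again */
		if (n < 0 && errno == EINTR)
			continue;		/* interrupted by a signal, retry */
		break;				/* 0: nothing left; <0: real error */
	}

	return close(fd);
}
```

A caller would simply replace its final shutdown()/close() pair with
drain_and_close(fd). On a non-blocking socket the loop stops at the
first EAGAIN, which is the intended behaviour in this sketch.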


