linux-kernel - Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20080613063037.GA16943@elte.hu>
Date:	Fri, 13 Jun 2008 08:30:37 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	David Miller <davem@...emloft.net>
Cc:	kuznet@....inr.ac.ru, vgusev@...nvz.org, mcmanus@...ksong.com,
	xemul@...nvz.org, netdev@...r.kernel.org,
	ilpo.jarvinen@...sinki.fi, linux-kernel@...r.kernel.org
Subject: Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets


* David Miller <davem@...emloft.net> wrote:

> From: David Miller <davem@...emloft.net>
> Date: Wed, 11 Jun 2008 16:52:55 -0700 (PDT)
> 
> > More and more, the arguments are mounting to completely revert the 
> > established code path changes, and frankly that is likely what I am 
> > going to do by the end of today.
> 
> Here is the revert patch I intend to send to Linus:
> 
> tcp: Revert 'process defer accept as established' changes.
> 
> This reverts two changesets, ec3c0982a2dd1e671bad8e9d26c28dcba0039d87
> ("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
> the follow-on bug fix 9ae27e0adbf471c7a6b80102e38e1d5a346b3b38
> ("tcp: Fix slab corruption with ipv6 and tcp6fuzz").
> 
> This change causes several problems, first reported by Ingo Molnar
> as a distcc-over-loopback regression where connections were getting
> stuck.
> 
> Ilpo Järvinen first spotted the locking problems.  The new function
> added by this code, tcp_defer_accept_check(), only has the
> child socket locked, yet it is modifying state of the parent
> listening socket.
> 
> Fixing that is non-trivial at best, because we can't simply just grab
> the parent listening socket lock at this point, because it would
> create an ABBA deadlock.  The normal ordering is parent listening
> socket --> child socket, but this code path would require the
> reverse lock ordering.
> 
> Next is a problem noticed by Vitaliy Gusev, he noted:
> 
> ----------------------------------------
> >--- a/net/ipv4/tcp_timer.c
> >+++ b/net/ipv4/tcp_timer.c
> >@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> > 		goto death;
> > 	}
> >
> >+	if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
> >+		tcp_send_active_reset(sk, GFP_ATOMIC);
> >+		goto death;
> 
> Here socket sk is not attached to listening socket's request queue. tcp_done()
> will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
> release this sk) as socket is not DEAD. Therefore socket sk will be lost for
> freeing.
> ----------------------------------------
> 
> Finally, Alexey Kuznetsov argues that there might not even be any
> real value or advantage to these new semantics even if we fix all
> of the bugs:
> 
> ----------------------------------------
> Hiding from accept() sockets with only out-of-order data only
> is the only thing which is impossible with old approach. Is this really
> so valuable? My opinion: no, this is nothing but a new loophole
> to consume memory without control.
> ----------------------------------------
> 
> So revert this thing for now.
> 
> Signed-off-by: David S. Miller <davem@...emloft.net>

the 3 reverts have been extensively tested in -tip via:

 # tip/out-of-tree: 9e5b6ca: tcp: revert DEFER_ACCEPT modifications

and the distcc problems are fixed. (The locking fix alone did not fix it 
conclusively in my testing, possibly due to the follow-on observations 
outlined in your description.)

Tested-by: Ingo Molnar <mingo@...e.hu>

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/