linux-kernel - Re: Revert: SUNRPC: xs_sock_mark_closed() does not need to trigger socket autoclose


Message-ID: <CB98840B-1CEE-48C6-9ABB-82DA8D213BAF@primarydata.com>
Date:	Fri, 1 Jul 2016 23:02:23 +0000
From:	Trond Myklebust <trondmy@...marydata.com>
To:	Rostedt Steven <rostedt@...dmis.org>
CC:	LKML <linux-kernel@...r.kernel.org>,
	Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
	Jeff Layton <jlayton@...chiereds.net>,
	"Eric Dumazet" <eric.dumazet@...il.com>,
	Schumaker Anna <anna.schumaker@...app.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Fields Bruce <bfields@...ldses.org>,
	Torvalds Linus <torvalds@...ux-foundation.org>
Subject: Re: Revert: SUNRPC: xs_sock_mark_closed() does not need to trigger
 socket autoclose


> On Jul 1, 2016, at 18:39, Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> On Fri, 1 Jul 2016 22:34:02 +0000
> Trond Myklebust <trondmy@...marydata.com> wrote:
> 
> 
>> NACK. This code was removed on purpose because it is dangerous to
>> have the TCP state change callbacks queue up a new close(). The
>> connect code sometimes has to close sockets that are misbehaving, and
>> so we’ve seen races whereby the old socket closes and triggers an
>> autoclose for the new socket while it is connecting.
> 
> OK fine. But can we please come up with a solution to get rid of the
> hidden port issue. It's very annoying that I get a message from
> rkhunter every morning telling me "Please inspect this machine, because
> it may be infected."
> 

Can we look into why the socket disconnect is happening in the first place? It’s presumably not the server, since that _would_ trigger an autoclose when the socket hits TCP_CLOSE_WAIT. That leaves the TCP keepalive and the TCP_USER_TIMEOUT as the two top suspects. Are there any tracepoints we could use to look at whether or not they are triggering a close?

Thanks
  Trond
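
Editorial illustration (not part of the original message): the two suspects named above, TCP keepalive and TCP_USER_TIMEOUT, are per-socket timers that can each make the local TCP stack drop a connection without the peer sending a FIN. The sketch below shows, from userspace on Linux, which socket options those timers correspond to. The helper names set_tcp_timers() and get_user_timeout_ms() and all numeric values are hypothetical, chosen only for illustration; the kernel RPC client sets its own timers from inside the kernel and does not use this code.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/*
 * Hypothetical helper: arm both timers on a TCP socket.
 *   idle_s          seconds of idle time before the first keepalive probe
 *   intvl_s         seconds between unanswered probes
 *   cnt             unanswered probes tolerated before the connection is dropped
 *   user_timeout_ms milliseconds that transmitted data may stay unacknowledged
 *                   before the connection is dropped
 * Returns 0 on success, -1 on the first failing setsockopt().
 */
static int set_tcp_timers(int fd, int idle_s, int intvl_s, int cnt,
                          unsigned int user_timeout_ms)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &user_timeout_ms, sizeof(user_timeout_ms)) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: read back TCP_USER_TIMEOUT in ms, or -1 on error. */
static long get_user_timeout_ms(int fd)
{
    unsigned int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &val, &len) < 0)
        return -1;
    return (long)val;
}
```

With example values such as set_tcp_timers(fd, 60, 10, 5, 30000), an idle connection to an unreachable peer would be dropped after roughly 60 + 5 * 10 seconds of failed probes, and a connection with unacknowledged data after about 30 seconds. Either path ends the connection locally, which is different from the server-initiated close that moves the socket to TCP_CLOSE_WAIT.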