Message-ID: <CANn89iJ6YmfgYZ=d4k1XoomhY7y5z1TxxXS-fg=Nsnyiva0A9g@mail.gmail.com>
Date: Tue, 16 Aug 2016 10:35:22 -0400
From: Eric Dumazet <edumazet@...gle.com>
To: Richard Cochran <richardcochran@...il.com>
Cc: Jouni Malinen <jkmalinen@...il.com>, rcochran@...utronix.de,
Thomas Gleixner <tglx@...utronix.de>,
Rik van Riel <riel@...hat.com>, Len Brown <lenb@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <fweisbec@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
George Spelvin <linux@...encehorizons.net>,
Josh Triplett <josh@...htriplett.org>,
Chris Mason <clm@...com>, rt@...utronix.de,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...nel.org>,
Arjan van de Ven <arjan@...radead.org>, j <j@...fi>
Subject: Re: [PREEMPT-RT] [patch 4 14/22] timer: Switch to a non cascading wheel
On Tue, Aug 16, 2016 at 5:46 AM, Richard Cochran
<richardcochran@...il.com> wrote:
> Jouni,
>
> If I understand the test correctly, then the slightly different kernel
> timer behavior is ok, but the test isn't quite right. Let me explain
> what I mean.
>
> First off, reading test_ap_wps.py, the point of the test is to see if
> ten simultaneous connections are possible. I guess the server
> implements a hard-coded limit on the number of clients. (BTW where is
> the server loop?)
>
> You said that the server also sets 'backlog' to ten. The backlog
> controls the size of the queue holding incoming connections that are
> in the SYN_RCVD or ESTABLISHED state but have not yet been
> accept(2)-ed by the server. This is *not* the same as the number of
> possible simultaneous connections.
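
To make the distinction concrete, here is a tiny hypothetical sketch
(plain Python, nothing taken from the hostapd test code): with
listen(10), twenty simultaneous connections are perfectly fine as long
as the server keeps calling accept(), because the backlog only bounds
how many completed handshakes may sit in the queue waiting for
accept() at any one time.

  # Hypothetical sketch, not the hwsim test: backlog 10, 20 clients.
  import socket, threading

  srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
  srv.bind(("127.0.0.1", 0))
  srv.listen(10)                       # backlog of 10, as in the test
  port = srv.getsockname()[1]

  accepted = []
  def acceptor():                      # keep draining the listen queue
      while len(accepted) < 20:
          conn, _ = srv.accept()
          accepted.append(conn)
  threading.Thread(target=acceptor, daemon=True).start()

  clients = [socket.create_connection(("127.0.0.1", port))
             for _ in range(20)]       # 20 simultaneous connections: fine
  print("open client sockets:", len(clients))
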
>
> On Sat, Aug 13, 2016 at 12:12:26PM +0300, Jouni Malinen wrote:
>> Yes, it looks like a TCP connect() timeout. I use a significantly
>> reduced timeout in the test scripts since they are run unattended and
>> are supposed to terminate in a reasonable amount of time. That said,
>
> I did not find where the client sets the one second timeout. Where
> does this happen?
>
>> If I increase that 20 to 50, I get more of these roughly 1.03 second
>> results at i=17, i=34, i=48..
>
> Can you provide the timings when the test runs on the older kernel?
>
>> Looking more at what exactly is happening at the TCP layer, this is
>> likely related to the server behavior since listen() backlog is set to
>> 10 and if there are 10 parallel connections, the last one is
>> immediately closed before reading anything.
>
> To clarify, when the backlog is exceeded, the new connection is not
> closed. Instead, the SYN is simply ignored, and the client is
> expected to re-transmit the SYN in the normal TCP fashion.
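
For reference: on Linux the client's first SYN retransmission goes out
after roughly one second (the initial retransmission timeout), and the
number of retries is capped by net.ipv4.tcp_syn_retries. A quick way
to check that limit on the box running the test:

  # Sketch: print the SYN retry limit; the first retransmission of an
  # unanswered SYN happens after about 1 s, later ones back off.
  with open("/proc/sys/net/ipv4/tcp_syn_retries") as f:
      print("tcp_syn_retries =", f.read().strip())
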
>
>> Looking at a sniffer capture (*), the three-way TCP connection goes
>> through fine for the first 15 connect() calls, but the 15th one does
>> not get a response to its SYN. This SYN is frame 47 in the capture
>> file with srcport == 60802. There is no SYN,ACK for it. The unexpected
>> roughly one-second connect() time comes from this, i.e., the
>> connection is completed only after the client side does TCP
>> retransmission of the SYN (frame #77) a second later and the server
>> side replies with RST,ACK (frame #78).
>
> This is the expected behavior.
>
>> So it looks like the issue is in one of the SYN,ACK frames getting
>> completely lost..
>
> No, the frame is not missing. It was never sent because the backlog
> was exceeded.
>
> Here is what I suspect is happening. Sending 20 SYN frames to a port
> with a backlog of 10 saturates the queue. One SYN is ignored by the
> kernel, and a race begins between the connect() timeout and the SYN
> re-transmission. If the client's re-transmitted SYN and the server's
> SYN,ACK in response arrive before the connect() timeout, the call to
> connect() succeeds. With the new timer wheel, the result of the race
> is different.
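
That would also explain the ~1.03 second numbers above: the initial
retransmission timeout for a SYN is about one second, so a one-second
connect() timeout races the retransmit almost exactly. A self-contained
sketch (hypothetical, loopback only, not the hwsim test) that
reproduces the roughly one-second connect() when the listen queue
overflows and is later drained:

  # Hypothetical sketch: overflow a tiny listen queue, drain it before
  # the ~1 s SYN retransmission, and watch connect() take just over 1 s.
  import socket, threading, time

  srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
  srv.bind(("127.0.0.1", 0))
  srv.listen(1)                            # tiny backlog to force overflow
  port = srv.getsockname()[1]

  # Fill the accept queue so the next SYN is silently dropped.
  fill = [socket.create_connection(("127.0.0.1", port)) for _ in range(2)]

  def drain():                             # free the queue after 0.5 s
      time.sleep(0.5)
      srv.settimeout(0.1)
      try:
          while True:
              srv.accept()
      except socket.timeout:
          pass
  threading.Thread(target=drain, daemon=True).start()

  c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  c.settimeout(5.0)                        # generous, unlike the 1 s test
  t0 = time.time()
  c.connect(("127.0.0.1", port))           # completes on the SYN retransmit
  print("connect() took %.2f s" % (time.time() - t0))
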
>
> There are a couple of ways to deal with this. One is to increase the
> backlog on the server side. Another is to increase the connect()
> timeout to a multiple of the re-transmission interval.
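
Either remedy is essentially a one-liner in the test; a hedged sketch
of both (identifiers are illustrative, not taken from test_ap_wps.py):

  import socket

  # Fix 1: give the server an accept queue deeper than the number of
  # simultaneous clients the test opens (e.g. 32 instead of 10).
  srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  srv.bind(("127.0.0.1", 0))
  srv.listen(32)

  # Fix 2: make the client's connect() timeout a multiple of the ~1 s
  # SYN retransmission interval, so one dropped SYN is survivable.
  cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  cli.settimeout(3.0)
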
>
> Thoughts?
>
I am coming late to the party, but yes, the test looks flaky.
(It relies on very precise SYN retransmit timing when the listen
backlog on the server side is full.)