Message-ID: <alpine.LNX.2.21.1908141316420.1803@kich.toxcorp.com>
Date: Wed, 14 Aug 2019 13:17:30 +0200 (CEST)
From: Jakub Jankowski <shasta@...corp.com>
To: Reindl Harald <h.reindl@...lounge.net>
cc: Thomas Jarosch <thomas.jarosch@...ra2net.com>,
Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, Florian Westphal <fw@...len.de>,
Jozsef Kadlecsik <kadlec@...ckhole.kfki.hu>,
Pablo Neira Ayuso <pablo@...filter.org>,
netfilter-devel@...r.kernel.org, coreteam@...filter.org,
netdev@...r.kernel.org
Subject: Re: [PATCH AUTOSEL 4.19 04/42] netfilter: conntrack: always store
window size un-scaled
On 2019-08-14, Reindl Harald wrote:
> that's still not in 5.2.8
It will make its way into the next 5.2.x release, as it is now in the pending
queue:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-5.2
Regards,
Jakub.
>
> without the exception and "nf_conntrack_tcp_timeout_max_retrans = 60" a
> vnc-over-ssh session having the VNC view in the background freezes
> within 60 seconds
>
> -----------------------------------------------------------------------------------------------
> IPV4 TABLE MANGLE (STATEFUL PRE-NAT/FILTER)
> -----------------------------------------------------------------------------------------------
> Chain PREROUTING (policy ACCEPT 100 packets, 9437 bytes)
> num  pkts bytes target  prot opt in      out  source     destination
> 1    6526 3892K ACCEPT  all  --  *       *    0.0.0.0/0  0.0.0.0/0    ctstate RELATED,ESTABLISHED
> 2     125  6264 ACCEPT  all  --  lo      *    0.0.0.0/0  0.0.0.0/0
> 3      64  4952 ACCEPT  all  --  vmnet8  *    0.0.0.0/0  0.0.0.0/0
> 4       1    40 DROP    all  --  *       *    0.0.0.0/0  0.0.0.0/0    ctstate INVALID
>
> -------- Forwarded Message --------
> Subject: [PATCH AUTOSEL 5.2 07/76] netfilter: conntrack: always store
> window size un-scaled
>
> On 08.08.19 at 11:02, Thomas Jarosch wrote:
>> Hello all,
>>
>> You wrote on Fri, Aug 02, 2019 at 09:22:24AM -0400:
>>> From: Florian Westphal <fw@...len.de>
>>>
>>> [ Upstream commit 959b69ef57db00cb33e9c4777400ae7183ebddd3 ]
>>>
>>> Jakub Jankowski reported following oddity:
>>>
>>> After 3 way handshake completes, timeout of new connection is set to
>>> max_retrans (300s) instead of established (5 days).
>>>
>>> shortened excerpt from pcap provided:
>>> 25.070622 IP (flags [DF], proto TCP (6), length 52)
>>> 10.8.5.4.1025 > 10.8.1.2.80: Flags [S], seq 11, win 64240, [wscale 8]
>>> 26.070462 IP (flags [DF], proto TCP (6), length 48)
>>> 10.8.1.2.80 > 10.8.5.4.1025: Flags [S.], seq 82, ack 12, win 65535, [wscale 3]
>>> 27.070449 IP (flags [DF], proto TCP (6), length 40)
>>> 10.8.5.4.1025 > 10.8.1.2.80: Flags [.], ack 83, win 512, length 0
>>>
>>> Turns out the last_win is of u16 type, but we store the scaled value:
>>> 512 << 8 (== 0x20000) becomes 0 window.
>>>
>>> The Fixes tag is not correct, as the bug has existed forever, but
>>> without that change, all this bug might cause is mistaking a
>>> window update (to-nonzero-from-zero) for a retransmit.
>>>
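For illustration, the truncation described above can be reproduced in a few
lines of C. This is only a sketch and not the conntrack code: store_scaled()
and store_unscaled() are hypothetical helper names, and the values 512 and 8
are taken from the pcap excerpt (window 512 in the final ACK, wscale 8 from
the client's SYN).

```c
#include <stdint.h>

/* Hypothetical helper: the value a u16 field ends up holding when the
 * already-scaled window is stored in it. The shift is done in 32 bits,
 * and the conversion to uint16_t keeps only the low 16 bits. */
uint16_t store_scaled(uint16_t win, uint8_t wscale)
{
    return (uint16_t)((uint32_t)win << wscale);
}

/* Hypothetical helper: store the raw window exactly as seen on the wire,
 * which always fits in 16 bits. */
uint16_t store_unscaled(uint16_t win)
{
    return win;
}
```

With the pcap values, store_scaled(512, 8) yields 0, because 0x20000 has no
bits set in the low 16, while store_unscaled(512) keeps 512. A stored value
of 0 is what looks like a zero window.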
>>> Fixes: fbcd253d2448b8 ("netfilter: conntrack: lower timeout to RETRANS seconds if window is 0")
>>> Reported-by: Jakub Jankowski <shasta@...corp.com>
>>> Tested-by: Jakub Jankowski <shasta@...corp.com>
>>> Signed-off-by: Florian Westphal <fw@...len.de>
>>> Acked-by: Jozsef Kadlecsik <kadlec@...ckhole.kfki.hu>
>>> Signed-off-by: Pablo Neira Ayuso <pablo@...filter.org>
>>> Signed-off-by: Sasha Levin <sashal@...nel.org>
>>
>> Also:
>> Tested-by: Thomas Jarosch <thomas.jarosch@...ra2net.com>
>>
>> ;)
>>
>> We've hit the issue with the wrong conntrack timeout at two different sites,
>> long-lived connections to a SAP server over IPSec VPN were constantly dropping.
>>
>> For us this was a regression after updating from kernel 3.14 to 4.19.
>> Yesterday I applied the patch to kernel 4.19.57 and the problem is fixed.
>>
>> The issue was extra hard to debug as we could only boot the new kernel
>> for twenty minutes in the evening on these production systems.
>>
>> The stable kernel patch from last Friday came right on time. I was just
>> about to replay the TCP connection with tcpreplay, so this saved
>> me from another week of debugging. Thanks everyone!
>
--
Jakub Jankowski|shasta@...corp.com|https://toxcorp.com/