[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <41ce587d-dfaa-fe6b-66a8-58ba1a3a2872@thelounge.net>
Date: Wed, 14 Aug 2019 12:19:06 +0200
From: Reindl Harald <h.reindl@...lounge.net>
To: Thomas Jarosch <thomas.jarosch@...ra2net.com>,
Sasha Levin <sashal@...nel.org>
Cc: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
Florian Westphal <fw@...len.de>,
Jakub Jankowski <shasta@...corp.com>,
Jozsef Kadlecsik <kadlec@...ckhole.kfki.hu>,
Pablo Neira Ayuso <pablo@...filter.org>,
netfilter-devel@...r.kernel.org, coreteam@...filter.org,
netdev@...r.kernel.org
Subject: Re: [PATCH AUTOSEL 4.19 04/42] netfilter: conntrack: always store
window size un-scaled
that's still not in 5.2.8
without the exception and "nf_conntrack_tcp_timeout_max_retrans = 60" a
vnc-over-ssh session having the VNC view in the background freezes
within 60 secods
-----------------------------------------------------------------------------------------------
IPV4 TABLE MANGLE (STATEFUL PRE-NAT/FILTER)
-----------------------------------------------------------------------------------------------
Chain PREROUTING (policy ACCEPT 100 packets, 9437 bytes)
num pkts bytes target prot opt in out source
destination
1 6526 3892K ACCEPT all -- * * 0.0.0.0/0
0.0.0.0/0 ctstate RELATED,ESTABLISHED
2 125 6264 ACCEPT all -- lo * 0.0.0.0/0
0.0.0.0/0
3 64 4952 ACCEPT all -- vmnet8 * 0.0.0.0/0
0.0.0.0/0
4 1 40 DROP all -- * * 0.0.0.0/0
0.0.0.0/0 ctstate INVALID
-------- Weitergeleitete Nachricht --------
Betreff: [PATCH AUTOSEL 5.2 07/76] netfilter: conntrack: always store
window size un-scaled
Am 08.08.19 um 11:02 schrieb Thomas Jarosch:
> Hello together,
>
> You wrote on Fri, Aug 02, 2019 at 09:22:24AM -0400:
>> From: Florian Westphal <fw@...len.de>
>>
>> [ Upstream commit 959b69ef57db00cb33e9c4777400ae7183ebddd3 ]
>>
>> Jakub Jankowski reported following oddity:
>>
>> After 3 way handshake completes, timeout of new connection is set to
>> max_retrans (300s) instead of established (5 days).
>>
>> shortened excerpt from pcap provided:
>> 25.070622 IP (flags [DF], proto TCP (6), length 52)
>> 10.8.5.4.1025 > 10.8.1.2.80: Flags [S], seq 11, win 64240, [wscale 8]
>> 26.070462 IP (flags [DF], proto TCP (6), length 48)
>> 10.8.1.2.80 > 10.8.5.4.1025: Flags [S.], seq 82, ack 12, win 65535, [wscale 3]
>> 27.070449 IP (flags [DF], proto TCP (6), length 40)
>> 10.8.5.4.1025 > 10.8.1.2.80: Flags [.], ack 83, win 512, length 0
>>
>> Turns out the last_win is of u16 type, but we store the scaled value:
>> 512 << 8 (== 0x20000) becomes 0 window.
>>
>> The Fixes tag is not correct, as the bug has existed forever, but
>> without that change all that this causes might cause is to mistake a
>> window update (to-nonzero-from-zero) for a retransmit.
>>
>> Fixes: fbcd253d2448b8 ("netfilter: conntrack: lower timeout to RETRANS seconds if window is 0")
>> Reported-by: Jakub Jankowski <shasta@...corp.com>
>> Tested-by: Jakub Jankowski <shasta@...corp.com>
>> Signed-off-by: Florian Westphal <fw@...len.de>
>> Acked-by: Jozsef Kadlecsik <kadlec@...ckhole.kfki.hu>
>> Signed-off-by: Pablo Neira Ayuso <pablo@...filter.org>
>> Signed-off-by: Sasha Levin <sashal@...nel.org>
>
> Also:
> Tested-by: Thomas Jarosch <thomas.jarosch@...ra2net.com>
>
> ;)
>
> We've hit the issue with the wrong conntrack timeout at two different sites,
> long-lived connections to a SAP server over IPSec VPN were constantly dropping.
>
> For us this was a regression after updating from kernel 3.14 to 4.19.
> Yesterday I've applied the patch to kernel 4.19.57 and the problem is fixed.
>
> The issue was extra hard to debug as we could just boot the new kernel
> for twenty minutes in the evening on these productive systems.
>
> The stable kernel patch from last Friday came right on time. I was just
> about the replay the TCP connection with tcpreplay, so this saved
> me from another week of debugging. Thanks everyone!
Powered by blists - more mailing lists