Message-ID: <4b4ff443-f8a9-26a8-8342-ae78b999335b@uls.co.za>
Date:   Fri, 1 Apr 2022 02:33:25 +0200
From:   Jaco Kroon <jaco@....co.za>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Neal Cardwell <ncardwell@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP
 connections

Hi,

On 2022/04/01 02:10, Eric Dumazet wrote:
> On Thu, Mar 31, 2022 at 4:06 PM Jaco Kroon <jaco@....co.za> wrote:
>> Hi Neal,
>>
>> This sniff was grabbed ON THE CLIENT HOST.  There is no middlebox or
>> anything between the sniffer and the client.  Only the firewall on the
>> host itself, where we've already established the traffic is NOT DISCARDED
>> (at least not in filter/INPUT).
>>
>> Setup on our end:
>>
>> 2 x routers, usually each with a direct peering with Google (which is
>> being ignored at the moment so instead traffic is incoming via IPT over DD).
>>
>> Connected via switch to
>>
>> 2 x firewalls, of which ONE is active (they have different networks
>> behind them, and could be active / standby for different networks behind
>> them - avoiding active-active because conntrackd is causing more trouble
>> than it's worth), Linux hosts, using netfilter, has been operating for
>> years, no recent kernel upgrades.
> Next step would be to attempt removing _all_ firewalls, especially not
> common setups like yours.
That, I'm afraid, is not going to happen here.  I can't imagine what we're
doing is that uncommon.  On the host, INPUT basically does: drop INVALID,
accept RELATED,ESTABLISHED, accept specific ports, drop everything
else.  Other than the redirects in NAT there really isn't anything "funny".
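
A minimal sketch of that INPUT evaluation order, in Python purely as
illustration.  The port numbers and all names here are assumptions made
for the example; they are not taken from the actual ruleset, and the NAT
redirects are not modelled.

```python
from dataclasses import dataclass

# Hypothetical allowed ports; the message does not say which ports are open.
ALLOWED_PORTS = {22, 443}


@dataclass
class Packet:
    ctstate: str  # conntrack state: "NEW", "ESTABLISHED", "RELATED" or "INVALID"
    dport: int    # destination port


def input_verdict(pkt: Packet) -> str:
    """Return the verdict of the first matching rule, in the order described."""
    if pkt.ctstate == "INVALID":
        return "DROP"      # rule 1: drop invalid
    if pkt.ctstate in ("RELATED", "ESTABLISHED"):
        return "ACCEPT"    # rule 2: accept related/established
    if pkt.dport in ALLOWED_PORTS:
        return "ACCEPT"    # rule 3: accept specific ports
    return "DROP"          # rule 4: drop everything else
```

Note that with this ordering a packet classified INVALID is dropped before
any port rule is consulted, even when it is addressed to an open port.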
>
> conntrack had a bug preventing TFO deployment for a while, because
> many boxes kept buggy kernel versions for years.

We don't use conntrackd.  We tried it many years back, but eventually we
just ended up using ucarp with /32s on the interfaces and whatever
subnet is required for the floating IP itself, combined with OSPF to
sort out the routing; that way we get to avoid asymmetric routing and
the need for conntrackd.  On FORWARD, the core firewalls basically do
some dispatching based on ingress and/or egress interface to determine
which ruleset to apply, again with INVALID and RELATED,ESTABLISHED rules
at the head.

>
> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
> tcp_in_window for Fast Open

This is from Aug 9, 2013 ... our firewall's kernel isn't that old :). 
Again, the traffic was sniffed on the client side of that firewall, and
the only firewall between the sniffer and the processing part of the
kernel is the local netfilter.

I'll deploy the same on a dev host we've got in the coming week and start a
bisect process.

Kind Regards,
Jaco
