Message-ID: <Y8Ci39eQNgqkTe4j@shell.armlinux.org.uk>
Date:   Fri, 13 Jan 2023 00:16:31 +0000
From:   "Russell King (Oracle)" <linux@...linux.org.uk>
To:     Florian Westphal <fw@...len.de>
Cc:     netdev@...r.kernel.org, netfilter-devel@...r.kernel.org,
        coreteam@...filter.org
Subject: Re: 6.1: possible bug with netfilter conntrack?

Hi Florian,

Thanks for the quick reply.

On Fri, Jan 13, 2023 at 12:38:08AM +0100, Florian Westphal wrote:
> Russell King (Oracle) <linux@...linux.org.uk> wrote:
> > Hi,
> > 
> > I've noticed that my network at home is rather struggling, and having
> > done some investigation, I find that the router VM is dropping packets
> > due to lots of:
> > 
> > nf_conntrack: nf_conntrack: table full, dropping packet
> > 
> > I find that there are about 2380 established and assured connections
> > with a destination of my incoming mail server with destination port 25,
> > and 2 packets. In the reverse direction, apparently only one packet was
> > sent according to conntrack. E.g.:
> > 
> > tcp      6 340593 ESTABLISHED src=180.173.2.183 dst=78.32.30.218
> > sport=49694 dport=25 packets=2 bytes=92 src=78.32.30.218
> > dst=180.173.2.183 sport=25 dport=49694 packets=1 bytes=44 [ASSURED]
> > use=1
> 
> Non-early-evictable entry that will expire in ~4 days, so not really
> surprising that this eventually fills the table.
> 
> I'd suggest reducing the
> net.netfilter.nf_conntrack_tcp_timeout_established
> sysctl to something more sane, e.g. 2 minutes or so unless you need
> to have longer timeouts.
> 
> But this did not change, so not the root cause of this problem.

I'll hold off trying that for now - I do tend to have some connections
that may be idle...
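
As a rough cross-check of the "~4 days" figure above, the remaining
lifetime in the sample entry (340593 seconds) can be converted by hand.
The sketch below assumes the stock default of 432000 seconds (5 days) for
nf_conntrack_tcp_timeout_established is in effect on the router; if that
sysctl has been tuned, the "idle" figure is meaningless. The function name
is made up for illustration.

```python
# Assumption: the default nf_conntrack_tcp_timeout_established of
# 432000 s (5 days) is in effect on the router VM.
DEFAULT_ESTABLISHED_TIMEOUT = 432000

# Remaining lifetime taken from the sample conntrack entry quoted above.
REMAINING_SECONDS = 340593


def seconds_since_refresh(remaining_s, timeout_s=DEFAULT_ESTABLISHED_TIMEOUT):
    """Time since the entry's timer was last reset to the full timeout."""
    return timeout_s - remaining_s


days_left = REMAINING_SECONDS / 86400
hours_idle = seconds_since_refresh(REMAINING_SECONDS) / 3600

print(f"expires in about {days_left:.2f} days")
print(f"last refreshed about {hours_idle:.1f} hours ago")
```

Under that assumption, the entry has seen no packet that refreshed its
timer for roughly a day, yet it will occupy a table slot for almost four
more.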

> > However, if I look at the incoming mail server, its kernel believes
> > there are no incoming port 25 connections, which matches exim.
> > 
> > I hadn't noticed any issues prior to upgrading from 5.16 to 6.1 on the
> > router VM, and the firewall rules have been the same for much of
> > 2021/2022.
> >
> > Is this a known issue? Has something changed between 5.16 and 6.1 in
> > the way conntrack works?
> 
> Nothing that should have such an impact.
> 
> Does 'sysctl net.netfilter.nf_conntrack_tcp_loose=0' avoid the buildup
> of such entries? I'm wondering if conntrack misses the connection
> shutdown or if it's perhaps triggering the entries because of late
> packets or similar.
> 
> If that doesn't help, you could also check if
> 
> 'sysctl net.netfilter.nf_conntrack_tcp_be_liberal=1' helps -- if it
> does, it's time for more debugging, but it's too early to start digging
> atm.  This would point at conntrack ignoring/discarding fin/reset
> packets.

I think first I need to work out how the issue arises, since it seems
to be behaving normally at the moment. I have for example:

$ grep 173.239.196.95 bad-conntrack.log | wc -l
314

and this resolves to 173-239-196-95.azu1ez9l.com. It looks like exim
was happy with that, so it would have issued its SMTP banner very
shortly after the connection was established. However, all the entries
in the conntrack table show packets=2 ... packets=1, meaning conntrack
only saw the SYN, SYN-ACK and ACK packets establishing the connection,
but not the packet carrying the SMTP banner, which seems mightily weird.
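
For reference, entries matching this pattern (ESTABLISHED, destination
port 25, two packets in the original direction and one in reply) could be
picked out of a saved dump and counted per source address along these
lines. This is only a sketch: it assumes the dump holds one entry per
line in the same layout as the sample entry quoted earlier, and the
function and variable names are made up for illustration.

```python
import re
from collections import Counter

# Matches one TCP entry in the layout of the sample quoted earlier.
# Assumption: one entry per line, with per-direction packet counters.
ENTRY_RE = re.compile(
    r"^tcp\s+6\s+(?P<ttl>\d+)\s+(?P<state>\S+)\s+"
    r"src=(?P<src>\S+)\s+dst=(?P<dst>\S+)\s+"
    r"sport=(?P<sport>\d+)\s+dport=(?P<dport>\d+)\s+"
    r"packets=(?P<orig_pkts>\d+)\s+bytes=\d+\s+"
    r"src=\S+\s+dst=\S+\s+sport=\d+\s+dport=\d+\s+"
    r"packets=(?P<reply_pkts>\d+)\s+bytes=\d+"
)


def stuck_sources(lines, dport=25):
    """Count, per source address, ESTABLISHED entries that never got
    past the three-way handshake (2 packets out, 1 packet back)."""
    counts = Counter()
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if m is None:
            continue  # not a TCP entry, or a layout this sketch does not know
        if (m["state"] == "ESTABLISHED"
                and int(m["dport"]) == dport
                and int(m["orig_pkts"]) == 2
                and int(m["reply_pkts"]) == 1):
            counts[m["src"]] += 1
    return counts


# The sample entry from earlier in the thread, joined onto one line.
sample = (
    "tcp      6 340593 ESTABLISHED src=180.173.2.183 dst=78.32.30.218 "
    "sport=49694 dport=25 packets=2 bytes=92 src=78.32.30.218 "
    "dst=180.173.2.183 sport=25 dport=49694 packets=1 bytes=44 [ASSURED] "
    "use=1"
)
print(stuck_sources([sample]).most_common())
```

Fed the lines of a file such as bad-conntrack.log, most_common() would
list the busiest sources first, which is a slightly more selective view
than grep | wc -l since it ignores entries in other states.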

I've just tried this from a machine on the 'net by telnetting in to the
SMTP port: the conntrack packet counters increase beyond 2/1, and when
exim times out the connection, the conntrack entry goes away - so
everything seems to work as it should.

Digging through the logs, it looks like the table-full message first
appeared, twice, on Dec 30th, just two and a half days after boot. It
then appeared eight times on Jan 10th, and since about 11pm on the 11th
the logs have been sporadically flooded with it.

I'll try to keep an eye on it and dig out something a bit more useful
which may help locate what the issue is, but it seems the trigger
mechanism isn't something obvious.
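
One way to keep an eye on the table without waiting for the "table full"
message is to compare nf_conntrack_count against nf_conntrack_max under
/proc/sys/net/netfilter. The following is a minimal sketch of that idea
only; the 80% threshold and the function names are arbitrary choices for
illustration, not something suggested in this thread.

```python
from pathlib import Path

# Standard location of the conntrack sysctls on Linux.
PROC_DIR = Path("/proc/sys/net/netfilter")


def fill_ratio(count, maximum):
    """Fraction of the conntrack table currently in use."""
    if maximum <= 0:
        raise ValueError("nf_conntrack_max must be positive")
    return count / maximum


def read_counter(name, base=PROC_DIR):
    """Read a single integer sysctl value from a file under base."""
    return int((base / name).read_text().strip())


def check(threshold=0.8, base=PROC_DIR):
    """Return (over_threshold, count, maximum); threshold is hypothetical."""
    count = read_counter("nf_conntrack_count", base)
    maximum = read_counter("nf_conntrack_max", base)
    return fill_ratio(count, maximum) >= threshold, count, maximum


if __name__ == "__main__":
    # Only attempt the read where the conntrack module is actually loaded.
    if (PROC_DIR / "nf_conntrack_count").exists():
        over, count, maximum = check()
        print(f"{count}/{maximum} entries in use, over threshold: {over}")
```

Run periodically (from cron, say), this would show whether the table
fills gradually or in bursts, which might help narrow down the trigger.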

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
