Message-ID: <aR3ZFSOawH-y_A3q@orbyte.nwl.cc>
Date: Wed, 19 Nov 2025 15:49:57 +0100
From: Phil Sutter <phil@....cc>
To: Hamza Mahfooz <hamzamahfooz@...ux.microsoft.com>
Cc: netdev@...r.kernel.org, Pablo Neira Ayuso <pablo@...filter.org>,
	Jozsef Kadlecsik <kadlec@...filter.org>,
	Florian Westphal <fw@...len.de>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, netfilter-devel@...r.kernel.org,
	coreteam@...filter.org, linux-kernel@...r.kernel.org
Subject: Re: Soft lock-ups caused by iptables

Hi,

On Tue, Nov 18, 2025 at 02:17:35PM -0800, Hamza Mahfooz wrote:
> I am able to consistently repro several cpu soft lock-ups that all seem
> to end up in either nft_chain_validate(), nft_match_validate(), or
> nft_match_validate(), see below for examples. Also, this doesn't seem
> to be a recent regression since I am able to repro it as far back as
> v5.15.184. The repro steps are rather convoluted (involving a config
> with a ~40k iptables rules and 2 vCPUs) so I am happy to test any
> patches. You can find the config I used to build the 6.18 kernel at [1].

Nftables ruleset validation code was refactored in v6.10 with commit
cff3bd012a95 ("netfilter: nf_tables: prefer nft_chain_validate"). That
commit is also present in v5.15.184, so to estimate whether a bug is
"new" or "old", it is better to test genuinely old kernels rather than
recent minor releases of old major ones. :)

Anyway, basically what happens is that nft_chain_validate() iterates
over each rule's expressions calling their 'validate' callback if
present. With nft_immediate, this leads to a recursive call to
nft_chain_validate() if the verdict is a jump/goto call. There is a
recursion limit involved, but chains are potentially revalidated
multiple times to cover all possible flow paths (e.g. with consecutive
rules jumping to the same chain).
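
To illustrate why revalidation gets expensive, here is a toy model in
Python. It is not the kernel code: the chain representation, the
MAX_DEPTH value and the function name are assumptions made up for this
sketch. Each chain is a list of rules, and a rule is either None or the
name of the chain it jumps to.

```python
# Toy model only, not kernel code. All names here are hypothetical.
MAX_DEPTH = 16  # assumed recursion limit for this sketch


def validate_chain(chains, name, depth=0, visits=None):
    """Walk every rule of a chain and recurse into each jump target.

    Returns a dict counting how often each chain was walked.
    """
    if visits is None:
        visits = {}
    if depth > MAX_DEPTH:
        raise RecursionError("jump stack too deep")
    visits[name] = visits.get(name, 0) + 1
    for target in chains[name]:
        if target is not None:
            # A jump/goto verdict: validate the target chain again,
            # even if an earlier rule already led there.
            validate_chain(chains, target, depth + 1, visits)
    return visits


# Four levels of chains; every chain holds three consecutive rules
# jumping to the same next chain.
chains = {f"c{i}": [f"c{i + 1}"] * 3 for i in range(4)}
chains["c4"] = [None]  # leaf chain without jumps

visits = validate_chain(chains, "c0")
print(visits)  # {'c0': 1, 'c1': 3, 'c2': 9, 'c3': 27, 'c4': 81}
```

In this model the number of walks multiplies at every level (3, 9, 27,
81), although the recursion limit is never reached.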

So, how many --jump/--goto calls does your 40k iptables dump contain? Is
this a (penetration) test or an actual ruleset in use? While it might be
possible to reduce the overhead involved with this chain validation,
maybe you want to consider using ipset (or better, nftables and its
verdict maps) to improve the ruleset in general?

On the nftables side, maybe we could annotate chains with a depth value
once validated, to skip digging into them again when revisiting from
another jump?
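
A rough sketch of one possible reading of that idea, again as a toy
model in Python rather than kernel code. The validated_at annotation,
MAX_DEPTH and the function name are assumptions for illustration: a
chain remembers the deepest call depth at which it validated
successfully, and a later visit at the same or a shallower depth is
skipped.

```python
# Toy model only, not kernel code. All names here are hypothetical.
MAX_DEPTH = 16  # assumed recursion limit for this sketch


def validate_chain_cached(chains, name, depth=0, validated_at=None,
                          visits=None):
    """Like a plain recursive walk, but skip chains already validated
    at the current depth or deeper."""
    if validated_at is None:
        validated_at = {}
    if visits is None:
        visits = {}
    if depth > MAX_DEPTH:
        raise RecursionError("jump stack too deep")
    # If the chain passed at depth d, everything below it fits within
    # the limit when entered at any depth <= d, so skip the walk.
    if validated_at.get(name, -1) >= depth:
        return visits
    visits[name] = visits.get(name, 0) + 1
    for target in chains[name]:
        if target is not None:
            validate_chain_cached(chains, target, depth + 1,
                                  validated_at, visits)
    # Annotate only after the whole subtree validated without error.
    validated_at[name] = depth
    return visits


# Four levels of chains, three jumps each to the same next chain.
chains = {f"c{i}": [f"c{i + 1}"] * 3 for i in range(4)}
chains["c4"] = [None]  # leaf chain without jumps

visits = validate_chain_cached(chains, "c0")
print(visits)  # {'c0': 1, 'c1': 1, 'c2': 1, 'c3': 1, 'c4': 1}
```

This toy only tracks jump depth. If validation also depends on which
path led to the chain, a depth annotation alone would not be enough to
justify skipping it.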

Cheers, Phil
