Message-ID: <aR3ZFSOawH-y_A3q@orbyte.nwl.cc>
Date: Wed, 19 Nov 2025 15:49:57 +0100
From: Phil Sutter <phil@....cc>
To: Hamza Mahfooz <hamzamahfooz@...ux.microsoft.com>
Cc: netdev@...r.kernel.org, Pablo Neira Ayuso <pablo@...filter.org>,
Jozsef Kadlecsik <kadlec@...filter.org>,
Florian Westphal <fw@...len.de>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, netfilter-devel@...r.kernel.org,
coreteam@...filter.org, linux-kernel@...r.kernel.org
Subject: Re: Soft lock-ups caused by iptables
Hi,
On Tue, Nov 18, 2025 at 02:17:35PM -0800, Hamza Mahfooz wrote:
> I am able to consistently repro several CPU soft lock-ups that all seem
> to end up in either nft_chain_validate(), nft_match_validate(), or
> nft_match_validate(); see below for examples. Also, this doesn't seem
> to be a recent regression since I am able to repro it as far back as
> v5.15.184. The repro steps are rather convoluted (involving a config
> with ~40k iptables rules and 2 vCPUs), so I am happy to test any
> patches. You can find the config I used to build the 6.18 kernel at [1].
The nftables ruleset validation code was refactored in v6.10 with commit
cff3bd012a95 ("netfilter: nf_tables: prefer nft_chain_validate"). That
commit is also present in v5.15.184, so to tell whether a bug is "new" or
"old", it is better to test a genuinely old kernel rather than a recent
minor release of an old major one. :)
Anyway, what basically happens is that nft_chain_validate() iterates
over each rule's expressions, calling their 'validate' callback if
present. With nft_immediate, this leads to a recursive call to
nft_chain_validate() if the verdict is a jump or goto. There is a
recursion limit involved, but a chain may be revalidated many times to
cover all possible flow paths (e.g. with consecutive rules jumping to
the same chain).
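To get a feeling for how quickly this adds up, here is a toy model in Python. It is not the kernel code, only a sketch of the recursion described above. The names count_visits() and layered_ruleset() and the MAX_DEPTH value are made up for illustration, and a ruleset is reduced to "which chain does each jump rule point at".

```python
# Toy model only, not kernel code: a ruleset maps each chain name to the
# list of chains its rules jump to (one entry per jump/goto rule).
MAX_DEPTH = 16  # hypothetical recursion limit for this model


def count_visits(ruleset, chain, depth=0):
    """Return how many chain validations happen when starting at `chain`."""
    if depth > MAX_DEPTH:
        raise RecursionError("jump stack too deep")
    visits = 1  # validating this chain itself
    for target in ruleset[chain]:
        # Every jump rule validates its target chain again from scratch.
        visits += count_visits(ruleset, target, depth + 1)
    return visits


def layered_ruleset(levels, fanout):
    """Build chains c0..c<levels>; each has `fanout` rules jumping to the next."""
    ruleset = {f"c{i}": [f"c{i + 1}"] * fanout for i in range(levels)}
    ruleset[f"c{levels}"] = []  # the last chain contains no jumps
    return ruleset


# Four chains, two jump rules each in c0..c2: 1 + 2 + 4 + 8 validations.
small = count_visits(layered_ruleset(3, 2), "c0")
```

In this model the number of validations is the sum of fanout^i over all levels, so 11 chains with 4 jump rules each in the first 10 (40 jump rules in total) already cause 1398101 chain validations. Real rulesets are not shaped like this, of course, but it shows why the number of jumps matters more than the number of rules.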
So, how many --jump/--goto calls does your 40k iptables dump contain? Is
this a (penetration) test or an actual ruleset in use? While it might be
possible to reduce the overhead involved with this chain validation,
maybe you want to consider using ipset (or better, nftables and its
verdict maps) to improve the ruleset in general?
On the nftables side, maybe we could annotate chains with a depth value
once validated to skip digging into them again when revisiting from
another jump?
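A rough sketch of that idea, again as a toy Python model and not as a proposed patch: each chain remembers the longest jump path below it once it has been fully validated, and a later jump to the same chain only re-checks the recursion limit. The names validate_cached() and validate_ruleset() and the MAX_DEPTH value are hypothetical, and the model assumes depth is the only thing validation depends on, which may well not hold for the real code.

```python
# Toy model only, not kernel code: a ruleset maps each chain name to the
# list of chains its rules jump to (one entry per jump/goto rule).
MAX_DEPTH = 16  # hypothetical recursion limit for this model


def validate_cached(ruleset, chain, depth, height, stats):
    """Validate `chain` reached at jump depth `depth`; return its height.

    `height` caches, per validated chain, the longest jump path below it.
    `stats["full"]` counts how many chains were walked rule by rule.
    """
    if chain in height:
        # Seen before: do not descend again, only re-check the limit.
        if depth + height[chain] > MAX_DEPTH:
            raise RecursionError("jump stack too deep")
        return height[chain]
    if depth > MAX_DEPTH:
        raise RecursionError("jump stack too deep")
    stats["full"] += 1
    below = 0
    for target in ruleset[chain]:
        sub = validate_cached(ruleset, target, depth + 1, height, stats)
        below = max(below, 1 + sub)
    height[chain] = below  # annotate the chain once fully validated
    return below


def validate_ruleset(ruleset, base_chain):
    """Return (number of full chain walks, cached heights)."""
    height, stats = {}, {"full": 0}
    validate_cached(ruleset, base_chain, 0, height, stats)
    return stats["full"], height


# Four chains, each of the first three with two rules jumping to the next.
ruleset = {"c0": ["c1", "c1"], "c1": ["c2", "c2"], "c2": ["c3", "c3"], "c3": []}
walks, height = validate_ruleset(ruleset, "c0")
```

Here every chain is walked only once (walks is 4), and the second jump to each chain is answered from the cached height. Loops still terminate in this model because a chain is only annotated after its walk completes, so a cycle keeps descending until it hits the depth limit.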
Cheers, Phil