linux-kernel - Re: [PATCH V6] netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack


Message-Id: <20251201110841.9519-1-xiafei_xupt@163.com>
Date: Mon,  1 Dec 2025 19:08:41 +0800
From: lvxiafei <xiafei_xupt@....com>
To: fw@...len.de
Cc: coreteam@...filter.org,
	davem@...emloft.net,
	edumazet@...gle.com,
	horms@...nel.org,
	kadlec@...filter.org,
	kuba@...nel.org,
	linux-kernel@...r.kernel.org,
	lvxiafei@...setime.com,
	netdev@...r.kernel.org,
	netfilter-devel@...r.kernel.org,
	pabeni@...hat.com,
	pablo@...filter.org,
	xiafei_xupt@....com
Subject: Re: [PATCH V6] netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_max sysctl

>  I've applied a variant of this patch to nf-next:testing.
>  
>  Could you please check that I adapted it correctly?
>  https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next.git/commit/?h=testing&id=b7bfa7d96fa5a7f3c2a69ad406ede520e658cb07
>  
>  (I added a patch right before that rejects conntrack_max=0).

Historically, some systems and scripts have used 0 to mean “unlimited”,
so any script that sets nf_conntrack_max to 0 will need to be adjusted.
Rejecting this value may break compatibility, so it would be good to document
this behavior change clearly in the commit message and/or changelog.
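
To make the compatibility concern concrete, here is a minimal userspace sketch of the two behaviors. It is not the kernel code: the helper names ct_limit_exceeded_legacy() and ct_max_validate() are hypothetical, and the sketch assumes the legacy check simply treats 0 as "no limit".

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not kernel code: legacy-style check in which
 * max == 0 is assumed to disable the limit entirely.
 */
bool ct_limit_exceeded_legacy(unsigned int count, unsigned int max)
{
	return max != 0 && count >= max;
}

/*
 * Hypothetical helper: validation once 0 is rejected at write time.
 * Returns 0 if the value is accepted, -1 if it is refused.
 */
int ct_max_validate(unsigned int new_max)
{
	return new_max == 0 ? -1 : 0;
}
```

A script that writes 0 relies on the first behavior. With the second, the same write is refused, which is why the change deserves an explicit note.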

>  
>  I wonder if we should update the sysctl path to reflect the
>  effective value, i.e., so that when netns sets
>  
>  nf_conntrack_max=1000000
>  
>  ... but init_net is capped at 65536, then a listing
>  shows the sysctl at 65536.
>  
>  It would be similar to what we do for max_buckets.

I would argue against updating the sysctl path to reflect the 
effective value. Doing so could be misleading, as it would no 
longer show the value actually configured in the namespace, 
but rather the clamped or capped value. Users might feel that 
their explicit configuration has been silently altered, which 
can be frustrating for professional users who rely on precise 
control. If there is a genuine need to see the effective value, 
it can be computed on demand or exposed via a separate parameter 
specifically indicating the effective value. Keeping the sysctl 
path as-is preserves transparency, predictability, and user trust.

>  
>  I also considered to make such a request fail at set time, but it
>  would make the sysctl fail/not fail 'randomly' and it also would
>  not do the right thing when init_net setting is reduced later.

I would be cautious about making the sysctl fail at set time. Doing 
so could lead to seemingly “random” failures depending on the current 
state of init_net, which would be confusing for users. Moreover, if 
init_net’s setting is reduced later, the sysctl behavior would not be 
consistent, and users might end up with invalid or unexpected values 
anyway. It seems safer to allow the set operation to succeed but let 
the effective value be determined by the existing limits, maintaining
predictable behavior.
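
As a rough illustration of what I mean, the configured value can be stored unchanged and the effective limit derived from it when needed. This is only a userspace sketch under my own assumptions: struct ct_limits and ct_effective_max() are hypothetical names, not the data structures used in the patch.

```c
/*
 * Hypothetical model, not the kernel's data structures.
 */
struct ct_limits {
	unsigned int init_net_max;	/* limit set in init_net, acts as the cap */
	unsigned int netns_max;		/* value written in the netns, stored as-is */
};

/*
 * Compute the effective limit on demand: the smaller of the two values.
 * The configured netns value is never modified.
 */
unsigned int ct_effective_max(const struct ct_limits *l)
{
	return l->netns_max < l->init_net_max ? l->netns_max : l->init_net_max;
}
```

With init_net_max = 65536 and netns_max = 1000000, the write succeeds and the sysctl still reads 1000000, while the effective limit is 65536. If init_net is lowered or raised later, the effective limit follows without any further write in the namespace.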