Message-Id: <20251023-scratch-bobbyeshleman-devmem-tcp-token-upstream-v5-4-47cb85f5259e@meta.com>
Date: Thu, 23 Oct 2025 13:58:23 -0700
From: Bobby Eshleman <bobbyeshleman@...il.com>
To: "David S. Miller" <davem@...emloft.net>, 
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
 Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
 Kuniyuki Iwashima <kuniyu@...gle.com>, 
 Willem de Bruijn <willemb@...gle.com>, Neal Cardwell <ncardwell@...gle.com>, 
 David Ahern <dsahern@...nel.org>, Mina Almasry <almasrymina@...gle.com>
Cc: Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org, 
 linux-kernel@...r.kernel.org, Bobby Eshleman <bobbyeshleman@...a.com>
Subject: [PATCH net-next v5 4/4] net: add per-netns sysctl for devmem
 autorelease

From: Bobby Eshleman <bobbyeshleman@...a.com>

Add a new per-namespace sysctl to control the autorelease
behavior of devmem dmabuf bindings. The sysctl is found at:
/proc/sys/net/core/devmem_autorelease
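
For illustration only, a userspace program could check the current setting
along these lines. This is a minimal sketch: the helper name
read_bool_sysctl() is hypothetical and not part of this patch; only the path
comes from the description above.

```c
#include <stdio.h>

/* Path taken from the patch description. */
#define DEVMEM_AUTORELEASE_PATH "/proc/sys/net/core/devmem_autorelease"

/*
 * Hypothetical helper: read a 0/1 sysctl value from `path`.
 * Returns 0 or 1 on success, -1 if the file is missing, unreadable,
 * or holds anything other than 0 or 1.
 */
static int read_bool_sysctl(const char *path)
{
	FILE *f = fopen(path, "r");
	int val;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);

	return (val == 0 || val == 1) ? val : -1;
}
```

Calling read_bool_sysctl(DEVMEM_AUTORELEASE_PATH) on a kernel without this
patch would be expected to return -1, since the file would not exist.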

When a binding is created, it inherits the autorelease setting from the
network namespace of the device to which it's being bound.
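
As far as this patch shows, the setting is copied once at bind time, so a
later write to the sysctl should only affect bindings created afterwards. A
small userspace model of that snapshot behavior, with hypothetical stand-in
types (model_netns, model_binding) rather than the real kernel structures:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-netns sysctl storage. */
struct model_netns {
	unsigned char sysctl_devmem_autorelease;	/* 0 or 1 */
};

/* Hypothetical stand-in for a dmabuf binding. */
struct model_binding {
	bool autorelease;
};

/* Copy the namespace setting into the binding, mirroring the bind path. */
static void model_bind(struct model_binding *b, const struct model_netns *ns)
{
	b->autorelease = ns->sysctl_devmem_autorelease;
}
```

In this model, binding with the sysctl at 0 and then setting it to 1 leaves
the first binding's autorelease flag false; only a second model_bind() call
picks up the new value.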

If autorelease is enabled (1):
- Tokens are stored in the socket's xarray
- Tokens are automatically released when the socket is closed

If autorelease is disabled (0):
- Tokens are tracked via a uref counter in each net_iov
- The user must manually release tokens via SO_DEVMEM_DONTNEED
- Lingering tokens are released when the dmabuf is unbound
- This is the new default behavior, chosen for better performance

This allows application developers to choose between automatic cleanup
(easier, backwards compatible) and manual control (more explicit token
management, but more performant).

Change the default to autorelease=0 so that users get the performance
benefit without further configuration.

Signed-off-by: Bobby Eshleman <bobbyeshleman@...a.com>
---
 include/net/netns/core.h   | 1 +
 net/core/devmem.c          | 2 +-
 net/core/net_namespace.c   | 1 +
 net/core/sysctl_net_core.c | 9 +++++++++
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 9ef3d70e5e9c..7af5ab0d757b 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -18,6 +18,7 @@ struct netns_core {
 	u8	sysctl_txrehash;
 	u8	sysctl_tstamp_allow_data;
 	u8	sysctl_bypass_prot_mem;
+	u8	sysctl_devmem_autorelease;
 
 #ifdef CONFIG_PROC_FS
 	struct prot_inuse __percpu *prot_inuse;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 8f3199fe0f7b..9cd6d93676f9 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -331,7 +331,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 		goto err_free_chunks;
 
 	list_add(&binding->list, &priv->bindings);
-	binding->autorelease = true;
+	binding->autorelease = dev_net(dev)->core.sysctl_devmem_autorelease;
 
 	return binding;
 
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index adcfef55a66f..890826b113d6 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -396,6 +396,7 @@ static __net_init void preinit_net_sysctl(struct net *net)
 	net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
 	net->core.sysctl_tstamp_allow_data = 1;
 	net->core.sysctl_txq_reselection = msecs_to_jiffies(1000);
+	net->core.sysctl_devmem_autorelease = 0;
 }
 
 /* init code that must occur even if setup_net() is not called. */
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 8d4decb2606f..375ec395227e 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -692,6 +692,15 @@ static struct ctl_table netns_core_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_ONE
 	},
+	{
+		.procname	= "devmem_autorelease",
+		.data		= &init_net.core.sysctl_devmem_autorelease,
+		.maxlen		= sizeof(u8),
+		.mode		= 0644,
+		.proc_handler	= proc_dou8vec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE
+	},
 	/* sysctl_core_net_init() will set the values after this
 	 * to readonly in network namespaces
 	 */

-- 
2.47.3

