Message-Id: <20251023-scratch-bobbyeshleman-devmem-tcp-token-upstream-v5-4-47cb85f5259e@meta.com>
Date: Thu, 23 Oct 2025 13:58:23 -0700
From: Bobby Eshleman <bobbyeshleman@...il.com>
To: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...gle.com>,
Willem de Bruijn <willemb@...gle.com>, Neal Cardwell <ncardwell@...gle.com>,
David Ahern <dsahern@...nel.org>, Mina Almasry <almasrymina@...gle.com>
Cc: Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, Bobby Eshleman <bobbyeshleman@...a.com>
Subject: [PATCH net-next v5 4/4] net: add per-netns sysctl for devmem
autorelease
From: Bobby Eshleman <bobbyeshleman@...a.com>
Add a new per-namespace sysctl to control the autorelease
behavior of devmem dmabuf bindings. The sysctl is found at:
/proc/sys/net/core/devmem_autorelease
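A userspace program could check or change this setting through the procfs file. The sketch below is only an illustration: the helper names read_bool_sysctl() and write_bool_sysctl() are hypothetical, and the path is taken as a parameter so the helpers can be tried against an ordinary file on a kernel that does not carry this patch.

```c
#include <stdio.h>

/* Path named in the commit message; present only on kernels with this patch. */
#define DEVMEM_AUTORELEASE_PATH "/proc/sys/net/core/devmem_autorelease"

/*
 * Read a 0/1 value from `path`.
 * Returns 0 or 1 on success, -1 if the file cannot be opened or
 * does not contain 0 or 1.
 */
static int read_bool_sysctl(const char *path)
{
	FILE *f = fopen(path, "r");
	int val;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return (val == 0 || val == 1) ? val : -1;
}

/* Write 0 or 1 to `path`. Returns 0 on success, -1 on error. */
static int write_bool_sysctl(const char *path, int enable)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", enable ? 1 : 0) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling read_bool_sysctl(DEVMEM_AUTORELEASE_PATH) from inside a network namespace would report that namespace's value; writing normally requires privileges in that namespace.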
When a binding is created, it inherits the autorelease setting from the
network namespace of the device to which it's being bound.
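The inheritance rule can be pictured with a small userspace model. The struct and function names below (model_netns, model_binding, model_bind) are hypothetical stand-ins and not the kernel structures; the model only mirrors the single assignment this patch makes in net_devmem_bind_dmabuf(), which copies the namespace value into the binding once, at bind time.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-netns setting (0 or 1). */
struct model_netns {
	unsigned char sysctl_devmem_autorelease;
};

/* Hypothetical stand-in for a dmabuf binding. */
struct model_binding {
	bool autorelease; /* snapshot of the netns value taken at bind time */
};

/* Copy the namespace setting into a new binding, as the patch does once
 * when the binding is created. */
static struct model_binding model_bind(const struct model_netns *ns)
{
	struct model_binding b = {
		.autorelease = ns->sysctl_devmem_autorelease != 0,
	};

	return b;
}
```

In this model, flipping the namespace value after model_bind() returns leaves the existing binding's flag untouched; only bindings created afterwards see the new value. As far as this patch shows, the kernel assignment behaves the same way.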
If autorelease is enabled (1):
- Tokens are stored in the socket's xarray
- Tokens are automatically released when the socket is closed
If autorelease is disabled (0):
- Tokens are tracked via a uref counter in each net_iov
- The user must manually release tokens via SO_DEVMEM_DONTNEED
- Lingering tokens are released when the dmabuf is unbound
- This is the new default behavior, chosen for better performance
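With autorelease disabled, the application is responsible for handing tokens back. The existing SO_DEVMEM_DONTNEED interface accepts (start, count) ranges, so one possible userspace approach is to batch received tokens into contiguous ranges before releasing them. The sketch below is an assumption-laden illustration: struct token_range is a local mirror of the uapi struct dmabuf_token layout (token_start, token_count), coalesce_tokens() is a hypothetical helper, and whether consecutive token ids actually occur depends on how the kernel hands them out.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the assumed uapi struct dmabuf_token layout. */
struct token_range {
	uint32_t token_start;
	uint32_t token_count;
};

/*
 * Coalesce an ascending array of token ids into contiguous ranges.
 * Writes at most `max_out` ranges to `out` and returns the number written.
 * Tokens that would need a range beyond `max_out` are left unprocessed.
 */
static size_t coalesce_tokens(const uint32_t *tokens, size_t n,
			      struct token_range *out, size_t max_out)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		/* Extend the last range when this token directly follows it. */
		if (used > 0 &&
		    out[used - 1].token_start + out[used - 1].token_count ==
		    tokens[i]) {
			out[used - 1].token_count++;
			continue;
		}
		if (used == max_out)
			break;
		out[used].token_start = tokens[i];
		out[used].token_count = 1;
		used++;
	}
	return used;
}
```

The resulting array could then be passed in one setsockopt(fd, SOL_SOCKET, SO_DEVMEM_DONTNEED, ...) call, subject to whatever per-call limits the kernel enforces.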
This allows application developers to choose between automatic cleanup
(easier, backwards compatible) and manual control (more explicit token
management, but more performant).
Change the default to autorelease=0 so that users get the performance
benefit without having to opt in.
Signed-off-by: Bobby Eshleman <bobbyeshleman@...a.com>
---
include/net/netns/core.h | 1 +
net/core/devmem.c | 2 +-
net/core/net_namespace.c | 1 +
net/core/sysctl_net_core.c | 9 +++++++++
4 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 9ef3d70e5e9c..7af5ab0d757b 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -18,6 +18,7 @@ struct netns_core {
u8 sysctl_txrehash;
u8 sysctl_tstamp_allow_data;
u8 sysctl_bypass_prot_mem;
+ u8 sysctl_devmem_autorelease;
#ifdef CONFIG_PROC_FS
struct prot_inuse __percpu *prot_inuse;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 8f3199fe0f7b..9cd6d93676f9 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -331,7 +331,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
goto err_free_chunks;
list_add(&binding->list, &priv->bindings);
- binding->autorelease = true;
+ binding->autorelease = dev_net(dev)->core.sysctl_devmem_autorelease;
return binding;
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index adcfef55a66f..890826b113d6 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -396,6 +396,7 @@ static __net_init void preinit_net_sysctl(struct net *net)
net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
net->core.sysctl_tstamp_allow_data = 1;
net->core.sysctl_txq_reselection = msecs_to_jiffies(1000);
+ net->core.sysctl_devmem_autorelease = 0;
}
/* init code that must occur even if setup_net() is not called. */
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 8d4decb2606f..375ec395227e 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -692,6 +692,15 @@ static struct ctl_table netns_core_table[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE
},
+ {
+ .procname = "devmem_autorelease",
+ .data = &init_net.core.sysctl_devmem_autorelease,
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE
+ },
/* sysctl_core_net_init() will set the values after this
* to readonly in network namespaces
*/
--
2.47.3