Message-ID: <20251028060714.2970818-1-shivajikant@google.com>
Date: Tue, 28 Oct 2025 06:07:14 +0000
From: Shivaji Kant <shivajikant@...gle.com>
To: netdev@...r.kernel.org
Cc: "David S . Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Mina Almasry <almasrymina@...gle.com>, Stanislav Fomichev <sdf@...ichev.me>,
Pavel Begunkov <asml.silence@...il.com>, Pranjal Shrivastava <praan@...gle.com>,
Shivaji Kant <shivajikant@...gle.com>, Vedant Mathur <vedantmathur@...gle.com>
Subject: [PATCH] net: devmem: Remove dst (ENODEV) check in net_devmem_get_binding
The Devmem TX binding lookup function, `net_devmem_get_binding()`,
performs a strict check against the socket's destination cache (`dst`)
to ensure the bound `dmabuf_id` corresponds to the correct network
device (`dst->dev->ifindex == binding->dev->ifindex`).
However, this check incorrectly fails and returns `-ENODEV`
if the socket's route cache entry (`dst`) is merely missing
or expired (`dst == NULL`). This scenario is observed during
network events, such as when flow steering rules are deleted,
leading to a temporary route cache invalidation.
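
To make the failure mode concrete, here is a minimal user-space sketch of
the pre-patch condition. The `model_*` types and `model_old_dst_check()`
are hypothetical stand-ins invented for illustration, not kernel API; only
the shape of the condition is taken from the removed line in the diff.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct model_dev {
	int ifindex;
};

struct model_dst {
	struct model_dev *dev;
};

struct model_sock {
	struct model_dst *dst_cache;	/* NULL once the cached route is invalidated */
};

/*
 * Same shape as the pre-patch test:
 *   !dst || !dst->dev || dst->dev->ifindex != binding->dev->ifindex
 * A missing route cache entry is reported exactly like a device mismatch.
 */
static int model_old_dst_check(const struct model_sock *sk,
			       const struct model_dev *binding_dev)
{
	const struct model_dst *dst = sk->dst_cache;

	if (!dst || !dst->dev || dst->dev->ifindex != binding_dev->ifindex)
		return -ENODEV;

	return 0;
}
```

In this model, a socket whose `dst_cache` points at the bound device passes
the check, while the same socket with `dst_cache == NULL` gets `-ENODEV`
even though nothing about the binding or the device has changed.
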
The parent caller, `tcp_sendmsg_locked()`, is already
responsible for acquiring or validating the route (`dst_entry`).
If `dst` is `NULL`, `tcp_sendmsg_locked()` will correctly
derive the route before transmission.
This patch removes the `dst` validation from
`net_devmem_get_binding()`. The function now only validates
the existence of the binding and its TX vector, relying on the
calling context for device/route correctness. This allows
temporary route cache misses to be handled gracefully by the
TCP/IP stack without an `-ENODEV` error on the Devmem TX path.
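
For readers following the control flow, the diff in this mail re-derives
the route when `dst` is `NULL` and only then compares devices. A rough
user-space model of that flow follows. All `sketch_*` names are
hypothetical, `sketch_rebuild_route()` merely stands in for the
`rebuild_header()` callback, and RCU locking is omitted, so this is a
sketch of the branching under those assumptions rather than of the kernel
code itself.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct sketch_dev {
	int ifindex;
};

struct sketch_dst {
	struct sketch_dev *dev;
};

struct sketch_sock {
	struct sketch_dst *dst_cache;	/* cached route, may be NULL */
	struct sketch_dst *route;	/* result of a fresh lookup, NULL if unreachable */
};

/* Stand-in for rebuild_header(): returns nonzero on failure. */
static int sketch_rebuild_route(struct sketch_sock *sk)
{
	if (!sk->route)
		return -1;

	sk->dst_cache = sk->route;
	return 0;
}

static int sketch_dst_check(struct sketch_sock *sk,
			    const struct sketch_dev *binding_dev)
{
	struct sketch_dst *dst = sk->dst_cache;

	/* A missing route is not an error yet: try to rebuild it first. */
	if (!dst) {
		if (sketch_rebuild_route(sk))
			return -EHOSTUNREACH;

		dst = sk->dst_cache;
		if (!dst)
			return -ENODEV;
	}

	/* Only a real device mismatch is reported as -ENODEV. */
	if (!dst->dev || dst->dev != binding_dev)
		return -ENODEV;

	return 0;
}
```

With an empty cache and a reachable route, the model repopulates
`dst_cache` and succeeds; it returns `-EHOSTUNREACH` only when the rebuild
fails and `-ENODEV` only when the resulting route points at a different
device than the binding.
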
Reported-by: Eric Dumazet <edumazet@...gle.com>
Reported-by: Vedant Mathur <vedantmathur@...gle.com>
Suggested-by: Eric Dumazet <edumazet@...gle.com>
Fixes: bd61848900bf ("net: devmem: Implement TX path")
Signed-off-by: Shivaji Kant <shivajikant@...gle.com>
---
net/core/devmem.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/net/core/devmem.c b/net/core/devmem.c
index d9de31a6cc7f..1d04754bc756 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -17,6 +17,7 @@
#include <net/page_pool/helpers.h>
#include <net/page_pool/memory_provider.h>
#include <net/sock.h>
+#include <net/tcp.h>
#include <trace/events/page_pool.h>
#include "devmem.h"
@@ -357,7 +358,8 @@ struct net_devmem_dmabuf_binding *net_devmem_get_binding(struct sock *sk,
unsigned int dmabuf_id)
{
struct net_devmem_dmabuf_binding *binding;
- struct dst_entry *dst = __sk_dst_get(sk);
+ struct net_device *dst_dev;
+ struct dst_entry *dst;
int err = 0;
binding = net_devmem_lookup_dmabuf(dmabuf_id);
@@ -366,16 +368,35 @@ struct net_devmem_dmabuf_binding *net_devmem_get_binding(struct sock *sk,
goto out_err;
}
+ rcu_read_lock();
+ dst = __sk_dst_get(sk);
+ /* If dst is NULL (route expired), attempt to rebuild it. */
+ if (unlikely(!dst)) {
+ if (inet_csk(sk)->icsk_af_ops->rebuild_header(sk)) {
+ err = -EHOSTUNREACH;
+ goto out_unlock;
+ }
+ dst = __sk_dst_get(sk);
+ if (unlikely(!dst)) {
+ err = -ENODEV;
+ goto out_unlock;
+ }
+ }
+
/* The dma-addrs in this binding are only reachable to the corresponding
* net_device.
*/
- if (!dst || !dst->dev || dst->dev->ifindex != binding->dev->ifindex) {
+ dst_dev = dst_dev_rcu(dst);
+ if (unlikely(!dst_dev) || unlikely(dst_dev != binding->dev)) {
err = -ENODEV;
- goto out_err;
+ goto out_unlock;
}
+ rcu_read_unlock();
return binding;
+out_unlock:
+ rcu_read_unlock();
out_err:
if (binding)
net_devmem_dmabuf_binding_put(binding);
--
2.51.1.838.g19442a804e-goog