Message-ID: <20251114010644.452441-1-m.lobanov@rosa.ru>
Date: Fri, 14 Nov 2025 04:06:42 +0300
From: Mikhail Lobanov <m.lobanov@...a.ru>
To: "David S. Miller" <davem@...emloft.net>
Cc: Mikhail Lobanov <m.lobanov@...a.ru>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
David Bauer <mail@...id-bauer.net>,
James Chapman <jchapman@...alix.com>,
netdev@...r.kernel.org,
linux-kernel@...r.kernel.org,
lvc-project@...uxtesting.org
Subject: [PATCH net v3] l2tp: fix double dst_release() on sk_dst_cache race
A reproducible rcuref - imbalanced put() warning is observed under
IPv6 L2TP (pppol2tp) traffic with blackhole routes, indicating an
imbalance in dst reference counting for routes cached in
sk->sk_dst_cache and pointing to a subtle lifetime/synchronization
issue between the helpers that validate and drop cached dst entries.
rcuref - imbalanced put()
WARNING: CPU: 0 PID: 899 at lib/rcuref.c:266 rcuref_put_slowpath+0x1ce/0x240 lib/rcuref.>
Modules linked in:
CPSocket connected tcp:127.0.0.1:48148,server=on <-> 127.0.0.1:33750
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01>
RIP: 0010:rcuref_put_slowpath+0x1ce/0x240 lib/rcuref.c:266
Call Trace:
<TASK>
__rcuref_put include/linux/rcuref.h:97 [inline]
rcuref_put include/linux/rcuref.h:153 [inline]
dst_release+0x291/0x310 net/core/dst.c:167
__sk_dst_check+0x2d4/0x350 net/core/sock.c:604
__inet6_csk_dst_check net/ipv6/inet6_connection_sock.c:76 [inline]
inet6_csk_route_socket+0x6ed/0x10c0 net/ipv6/inet6_connection_sock.c:104
inet6_csk_xmit+0x12f/0x740 net/ipv6/inet6_connection_sock.c:121
l2tp_xmit_queue net/l2tp/l2tp_core.c:1214 [inline]
l2tp_xmit_core net/l2tp/l2tp_core.c:1309 [inline]
l2tp_xmit_skb+0x1404/0x1910 net/l2tp/l2tp_core.c:1325
pppol2tp_sendmsg+0x3ca/0x550 net/l2tp/l2tp_ppp.c:302
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg net/socket.c:744 [inline]
____sys_sendmsg+0xab2/0xc70 net/socket.c:2609
___sys_sendmsg+0x11d/0x1c0 net/socket.c:2663
__sys_sendmmsg+0x188/0x450 net/socket.c:2749
__do_sys_sendmmsg net/socket.c:2778 [inline]
__se_sys_sendmmsg net/socket.c:2775 [inline]
__x64_sys_sendmmsg+0x98/0x100 net/socket.c:2775
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x64/0x140 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fe6960ec719
</TASK>
The race occurs between the lockless UDPv6 transmit path
(udpv6_sendmsg() -> sk_dst_check()) and the locked L2TP/pppol2tp
transmit path (pppol2tp_sendmsg() -> l2tp_xmit_skb() ->
... -> inet6_csk_xmit() -> __sk_dst_check()) when both handle
the same obsolete dst from sk->sk_dst_cache. The UDPv6 side takes
an extra reference, then atomically steals the cached dst and
releases the cache-owned reference. The L2TP side still holds a
stale copy of the cached pointer, for which it took no reference
of its own, and calls dst_release() on it as well. Once the UDPv6
side also drops its own reference, the dst has seen one
dst_release() too many, triggering rcuref - imbalanced put().
The Race Condition:

Initial:
  sk->sk_dst_cache = dst
  ref(dst) = 1

Thread 1: sk_dst_check()                  Thread 2: __sk_dst_check()
------------------------                  ----------------------------
sk_dst_get(sk):
  rcu_read_lock()
  dst = rcu_dereference(sk->sk_dst_cache)
  rcuref_get(dst) succeeds
  rcu_read_unlock()
  // ref = 2
                                          dst = __sk_dst_get(sk)
                                          // reads same dst from sk_dst_cache
                                          // ref still = 2 (no extra get)

          [both see dst obsolete & check() == NULL]

sk_dst_reset(sk):
  old = xchg(&sk->sk_dst_cache, NULL)
  // old = dst
  dst_release(old)
  // drop cached ref
  // ref: 2 -> 1
                                          RCU_INIT_POINTER(sk->sk_dst_cache, NULL)
                                          // cache already NULL after xchg
                                          dst_release(dst)
                                          // ref: 1 -> 0
dst_release(dst)
// tries to drop its own ref after final put
// rcuref_put_slowpath() -> "rcuref - imbalanced put()"
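
The interleaving above can be replayed deterministically in userspace
to check the reference arithmetic. This is only a toy sketch, not
kernel code: struct toy_dst, struct toy_sock, toy_dst_release() and
simulate_race() are hypothetical names invented for illustration, the
refcount is a plain int rather than an rcuref, and the two "threads"
are simply executed in the order shown in the diagram.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct dst_entry and struct sock. */
struct toy_dst { int refcnt; int obsolete; };
struct toy_sock { struct toy_dst *dst_cache; };

static int imbalanced_puts;

/* Models dst_release(): a put on a zero count is recorded, not applied. */
static void toy_dst_release(struct toy_dst *dst)
{
	if (!dst)
		return;
	if (dst->refcnt <= 0) {
		imbalanced_puts++;	/* where the kernel would warn */
		return;
	}
	dst->refcnt--;
}

/* Replays the diagram step by step; returns the number of bad puts. */
int simulate_race(void)
{
	struct toy_dst dst = { .refcnt = 1, .obsolete = 1 };
	struct toy_sock sk = { .dst_cache = &dst };
	struct toy_dst *t1, *t2, *old;

	imbalanced_puts = 0;

	/* Thread 1, sk_dst_get(): takes its own reference (1 -> 2). */
	t1 = sk.dst_cache;
	assert(t1 != NULL);
	t1->refcnt++;

	/* Thread 2, __sk_dst_get(): borrows the pointer, no reference. */
	t2 = sk.dst_cache;

	/* Thread 1, sk_dst_reset(): steal the cache, drop its ref (2 -> 1). */
	old = sk.dst_cache;
	sk.dst_cache = NULL;
	toy_dst_release(old);

	/* Thread 2: clears the already-NULL cache, then puts (1 -> 0). */
	sk.dst_cache = NULL;
	toy_dst_release(t2);

	/* Thread 1: drops its own reference after the final put. */
	toy_dst_release(t1);

	return imbalanced_puts;
}
```

In this model three puts are issued against a count that only ever
reached two, so simulate_race() reports exactly one imbalanced put.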
Make the IPv6 L2TP transmit path explicitly pre-validate the
cached dst under the socket lock. Read sk->sk_dst_cache via
__sk_dst_get() and, if the dst is obsolete and
dst->ops->check(dst, inet6_sk(tunnel->sock)->dst_cookie) == NULL,
drop it from the cache with sk_dst_reset(tunnel->sock), so that the
cache-owned reference is released exactly once via the xchg-based
helper. The later __sk_dst_check() inside inet6_csk_xmit() then
sees either a NULL cache or a still-valid dst and no longer races
with the lockless sk_dst_check() to double-release the same
cached dst.
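
The accounting argument behind the fix can be sketched with a similar
userspace toy model. Everything here is an assumption made for
illustration: the toy_* names and simulate_fixed() are hypothetical,
"obsolete" stands in for "obsolete and check() == NULL", and the model
covers only reference counts for two sequential orderings. It says
nothing about memory lifetime or about other interleavings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct dst_entry and struct sock. */
struct toy_dst { int refcnt; int obsolete; };
struct toy_sock { struct toy_dst *dst_cache; };

static int imbalanced_puts;

static void toy_dst_release(struct toy_dst *dst)
{
	if (!dst)
		return;
	if (dst->refcnt <= 0) {
		imbalanced_puts++;
		return;
	}
	dst->refcnt--;
}

/* Models sk_dst_reset(): only whoever swaps out a non-NULL pointer puts. */
static void toy_sk_dst_reset(struct toy_sock *sk)
{
	struct toy_dst *old = sk->dst_cache;

	sk->dst_cache = NULL;
	toy_dst_release(old);
}

/* Models the added pre-validation in the L2TP transmit path. */
static void toy_prevalidate(struct toy_sock *sk)
{
	struct toy_dst *dst = sk->dst_cache;

	if (dst && dst->obsolete)
		toy_sk_dst_reset(sk);
}

/* Models __sk_dst_check(): clears the cache and puts a stale dst. */
static struct toy_dst *toy_locked_dst_check(struct toy_sock *sk)
{
	struct toy_dst *dst = sk->dst_cache;

	if (dst && dst->obsolete) {
		sk->dst_cache = NULL;
		toy_dst_release(dst);
		return NULL;
	}
	return dst;
}

/* Returns the number of bad puts; stores the final count in *final_refcnt. */
int simulate_fixed(int udp_resets_first, int *final_refcnt)
{
	struct toy_dst dst = { .refcnt = 1, .obsolete = 1 };
	struct toy_sock sk = { .dst_cache = &dst };
	struct toy_dst *t1;

	imbalanced_puts = 0;

	/* UDPv6 side takes its own reference (1 -> 2). */
	t1 = sk.dst_cache;
	t1->refcnt++;

	if (udp_resets_first) {
		toy_sk_dst_reset(&sk);	/* drops the cache ref (2 -> 1) */
		toy_prevalidate(&sk);	/* cache is NULL, nothing to put */
	} else {
		toy_prevalidate(&sk);	/* drops the cache ref (2 -> 1) */
		toy_sk_dst_reset(&sk);	/* swaps out NULL, nothing to put */
	}

	/* The later locked check finds an empty cache and puts nothing. */
	assert(toy_locked_dst_check(&sk) == NULL);

	/* UDPv6 side drops its own reference (1 -> 0). */
	toy_dst_release(t1);

	*final_refcnt = dst.refcnt;
	return imbalanced_puts;
}
```

In both orderings the cache-owned reference is dropped by exactly one
caller, so the model ends with a count of zero and no imbalanced put.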
Fixes: b0270e91014d ("ipv4: add a sock pointer to ip_queue_xmit()")
Signed-off-by: Mikhail Lobanov <m.lobanov@...a.ru>
---
v2: move fix to L2TP as suggested by Eric Dumazet.
v3: dropped the lockless sk_dst_check() pre-validation
and the extra sk_dst_get() reference; instead, under
the socket lock, mirror __sk_dst_check()’s condition
and invalidate the cached dst via sk_dst_reset(sk) so
the cache-owned ref is released exactly once via the
xchg-based helper.
net/l2tp/l2tp_core.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 0710281dd95a..b379b7e6470a 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1210,9 +1210,17 @@ static int l2tp_xmit_queue(struct l2tp_tunnel *tunnel, struct sk_buff *skb, stru
 	skb->ignore_df = 1;
 	skb_dst_drop(skb);
 #if IS_ENABLED(CONFIG_IPV6)
-	if (l2tp_sk_is_v6(tunnel->sock))
+	if (l2tp_sk_is_v6(tunnel->sock)) {
+		struct dst_entry *dst = __sk_dst_get(tunnel->sock);
+
+		if (dst) {
+			if (dst && READ_ONCE(dst->obsolete) &&
+			    dst->ops->check(dst,
+					    inet6_sk(tunnel->sock)->dst_cookie) == NULL)
+				sk_dst_reset(tunnel->sock);
+		}
 		err = inet6_csk_xmit(tunnel->sock, skb, NULL);
-	else
+	} else
 #endif
 		err = ip_queue_xmit(tunnel->sock, skb, fl);
--
2.47.2