Message-ID: <CAH2r5mtoz+4RSDLJijNFD6dRiLTWKou8m3M2mTp2cy7oPsP=Qg@mail.gmail.com>
Date: Wed, 18 Dec 2024 18:28:38 -0600
From: Steve French <smfrench@...il.com>
To: linux-fsdevel <linux-fsdevel@...r.kernel.org>, linux-net@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>
Cc: Enzo Matsumiya <ematsumiya@...e.de>
Subject: Fwd: [PATCH][SMB3 client] fix TCP timers deadlock after rmmod
Adding fsdevel and networking in case anyone has thoughts on this fix for
a network namespace refcount issue (which causes a UAF after rmmod).
Any opinions on Enzo's proposed fix?
---------- Forwarded message ---------
From: Steve French <smfrench@...il.com>
Date: Tue, Dec 17, 2024 at 9:24 PM
Subject: [PATCH][SMB3 client] fix TCP timers deadlock after rmmod
To: CIFS <linux-cifs@...r.kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@...zon.com>, Enzo Matsumiya <ematsumiya@...e.de>
Enzo had an interesting patch that seems to fix an important problem.
Here was his repro scenario:
tw:~ # mount.cifs -o credentials=/root/wincreds,echo_interval=10 //someserver/target1 /mnt/test
tw:~ # ls /mnt/test
abc dir1 dir3 target1_file.txt tsub
tw:~ # iptables -A INPUT -s someserver -j DROP
Trigger reconnect and wait for 3*echo_interval:
tw:~ # cat /mnt/test/target1_file.txt
cat: /mnt/test/target1_file.txt: Host is down
Then umount and rmmod. Note that rmmod might take several iterations
until it properly tears down everything, so make sure you see the "not
loaded" message before proceeding:
tw:~ # umount /mnt/*; rmmod cifs
umount: /mnt/az: not mounted.
umount: /mnt/dfs: not mounted.
umount: /mnt/local: not mounted.
umount: /mnt/scratch: not mounted.
rmmod: ERROR: Module cifs is in use
...
tw:~ # rmmod cifs
rmmod: ERROR: Module cifs is not currently loaded
Then kick off the TCP internals:
tw:~ # iptables -F
This triggers the lockdep warning (requires CONFIG_LOCKDEP=y), followed
by a NULL dereference later on.
Any thoughts on his patch? See below (and attached)
Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
fixed a netns UAF by manually enabling socket refcounting
(sk->sk_net_refcnt=1 and sock_inuse_add(net, 1)).
The reason the patch worked for that bug is that we now hold
references to the netns (get_net_track() gets a ref internally)
and they're properly released (internally, on __sk_destruct()),
but only because sk->sk_net_refcnt was set.
Problem:
(this happens regardless of CONFIG_NET_NS_REFCNT_TRACKER and regardless
of whether the netns is init_net or another one)
Setting sk->sk_net_refcnt=1 *manually* and *after* socket creation is not
only outside the scope of cifs, but also technically wrong -- it's set
conditionally based on user (=1) vs kernel (=0) sockets. And net/
implementations seem to base their user vs kernel space operations on it.
For example, upon TCP socket close, the TCP timers are not cleared because
sk->sk_net_refcnt=1:
(cf. commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets"))
net/ipv4/tcp.c:
    void tcp_close(struct sock *sk, long timeout)
    {
            lock_sock(sk);
            __tcp_close(sk, timeout);
            release_sock(sk);
            if (!sk->sk_net_refcnt)
                    inet_csk_clear_xmit_timers_sync(sk);
            sock_put(sk);
    }
This throws a lockdep warning and then, as expected, deadlocks on
tcp_write_timer().
A way to reproduce this is by running the reproducer from ef7134c7fc48
and then 'rmmod cifs'. A few seconds later, the deadlock/lockdep
warning shows up.
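To make the tcp_close() branch quoted above concrete, here is a small userspace toy model (not kernel code). It mirrors only the `if (!sk->sk_net_refcnt)` check; the `toy_*` names and the `timer_pending` field are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sock; only the two facts this model needs. */
struct toy_sock {
	bool sk_net_refcnt;  /* true: treated as a user socket, false: kernel socket */
	bool timer_pending;  /* hypothetical flag: a TCP timer is still armed */
};

/*
 * Mirrors only the branch quoted from tcp_close(): timers are cleared
 * synchronously only when sk_net_refcnt is unset.
 */
static void toy_tcp_close(struct toy_sock *sk)
{
	assert(sk != NULL);
	if (!sk->sk_net_refcnt)
		sk->timer_pending = false;
}

/* Returns true if an armed timer is still pending after close. */
static bool toy_timer_survives_close(bool sk_net_refcnt)
{
	struct toy_sock sk = {
		.sk_net_refcnt = sk_net_refcnt,
		.timer_pending = true,
	};

	toy_tcp_close(&sk);
	return sk.timer_pending;
}
```

In this model, `toy_timer_survives_close(true)` corresponds to the cifs socket after sk_net_refcnt was forced to 1: the timer is left armed and can fire after the module is gone.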
Fix:
We shouldn't mess with socket internals ourselves, so do not set
sk_net_refcnt manually.
Also change __sock_create() to sock_create_kern() for explicitness.
As for non-init_net network namespaces, we deal with it the best way
we can -- hold an extra netns reference for server->ssocket and drop it
when it's released. This ensures that the netns still exists whenever
we need to create/destroy server->ssocket, but is not directly tied to
it.
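The reference pairing described above can be sketched as a userspace toy model (again, not the actual patch). All `toy_*` names are hypothetical; the real code would use the kernel's own netns get/put helpers, and where exactly the patch places them is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct net: just a reference counter. */
struct toy_net {
	int refs;
};

/* Toy stand-in for the cifs server struct. */
struct toy_server {
	struct toy_net *net;
	int has_socket;  /* stand-in for server->ssocket != NULL */
};

static void toy_get_net(struct toy_net *net)
{
	net->refs++;
}

static void toy_put_net(struct toy_net *net)
{
	assert(net->refs > 0);
	net->refs--;
}

/*
 * Socket creation: the kernel socket itself does not pin the netns
 * (sk_net_refcnt stays 0), so the owner takes one extra reference.
 */
static void toy_connect(struct toy_server *server, struct toy_net *net)
{
	server->net = net;
	toy_get_net(net);
	server->has_socket = 1;
}

/* Socket release: drop the extra reference taken in toy_connect(). */
static void toy_release(struct toy_server *server)
{
	server->has_socket = 0;
	toy_put_net(server->net);
	server->net = NULL;
}
```

The point of the design is that the netns lifetime is tied to the owner of the socket rather than to the socket's internal fields, so the networking core still sees an ordinary kernel socket.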
Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
--
Thanks,
Steve
Download attachment "0001-smb-client-fix-TCP-timers-deadlock-after-rmmod.patch" of type "application/x-patch" (5831 bytes)