[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2ee3486d-3813-4eb1-99e1-edbd4c9dac6a@huawei.com>
Date: Fri, 7 Mar 2025 21:18:14 +0800
From: "dongchenchen (A)" <dongchenchen2@...wei.com>
To: Jiayuan Chen <mrpre@....com>, Cong Wang <xiyou.wangcong@...il.com>
CC: <edumazet@...gle.com>, <kuniyu@...zon.com>, <pabeni@...hat.com>,
<willemb@...gle.com>, <john.fastabend@...il.com>, <jakub@...udflare.com>,
<davem@...emloft.net>, <kuba@...nel.org>, <horms@...nel.org>,
<daniel@...earbox.net>, <netdev@...r.kernel.org>, <bpf@...r.kernel.org>,
<zhangchangzhong@...wei.com>, <weiyongjun1@...wei.com>
Subject: Re: [PATCH net] bpf, sockmap: Restore sk_prot ops when psock is
removed from sockmap
On 2025/3/6 13:06, Jiayuan Chen wrote:
> On Wed, Mar 05, 2025 at 10:12:43AM +0800, Cong Wang wrote:
>> On Wed, Mar 05, 2025 at 10:02:34PM +0800, Dong Chenchen wrote:
>>> WARNING: CPU: 0 PID: 6558 at net/core/sock_map.c:1703 sock_map_close+0x3c4/0x480
>>> Modules linked in:
>>> CPU: 0 UID: 0 PID: 6558 Comm: syz-executor.14 Not tainted 6.14.0-rc5+ #238
>>> RIP: 0010:sock_map_close+0x3c4/0x480
>>> Call Trace:
>>> <TASK>
>>> inet_release+0x144/0x280
>>> __sock_release+0xb8/0x270
>>> sock_close+0x1e/0x30
>>> __fput+0x3c6/0xb30
>>> __fput_sync+0x7b/0x90
>>> __x64_sys_close+0x90/0x120
>>> do_syscall_64+0x5d/0x170
>>> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>
>>> The root cause is:
>>> sock_hash_update_common
>>> sock_map_unref
>>> sock_map_del_link
>>> psock->psock_update_sk_prot(sk, psock, false);
>>> //false won't restore proto
>>> sk_psock_put
>>> rcu_assign_sk_user_data(sk, NULL);
>>> inet_release
>>> sk->sk_prot->close
>>> sock_map_close
>>> WARN(sk->sk_prot->close == sock_map_close)
>>>
>>> When psock is removed from sockmap, sock_map_del_link() still set
>>> sk->sk_prot to bpf proto instead of restore it (for incorrect restore
>>> value). sock release will triger warning of sock_map_close() for
>>> recurse after psock drop.
>> But sk_psock_drop() restores it with sk_psock_restore_proto() after the
>> psock reference count goes to zero. So how could the above happen?
>>
>> By the way, it would be perfect if you could add a test case for it
>> together with this patch (a followup patch is fine too).
>>
>> Thanks!
> I also have the same question as Cong, and I'll describe it in more detail
> here:
>
> 'psock->saved_close' is always tcp_close (if your socket is TCP) and will
> not change regardless of whether restore is executed or not. So when
> entering the function sock_map_close() and encountering
> WARN_ON_ONCE(saved_close == sock_map_close), 'saved_close' can only come
> from 'saved_close = READ_ONCE(sk->sk_prot)->close'. This means we obtain
> sock through psock = sk_psock(sk) and then enter the branch code after
> judging it to be null.
> '''
> sock_map_close()
> {
> psock = sk_psock(sk);
> if (likely(psock)) {
> saved_close = psock->saved_close;
> } else {
> saved_close = READ_ONCE(sk->sk_prot)->close;
> }
> WARN_ON_ONCE(saved_close == sock_map_close);
> }
> '''
> However, before psock becomes null, we have actually successfully executed
> the restore:
> '''
> void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
> {
> write_lock_bh(&sk->sk_callback_lock);
> sk_psock_restore_proto(sk, psock); // restore correctly
> rcu_assign_sk_user_data(sk, NULL); // set psock null
> ...
> }
> '''
>
> Passing false to psock_update_sk_prot may be problematic, but it shouldn't
> cause the issue described in the commit message.
> It may be necessary to provide more information on how sockmap is used to
> determine the issue. :)
>
> Thanks.
Thanks for your review!
sk_psock_restore_protofail to restore sk_prot for tcp_ulp set concurrently.
sock_hash_update_common
sock_map_unref
sock_map_del_link
setsockopt(TCP_ULP)
tcp_set_ulp
icsk->icsk_ulp_ops = ulp_ops; //set ulp
sk_psock_put
sk_psock_restore_proto
tcp_bpf_update_proto
if (inet_csk_has_ulp(sk))
tls_update
ctx->sk_proto = p; //not update sk->sk_prot for ulp set
inet_release
sk->sk_prot->close
sock_map_close
WARN(sk->sk_prot->close == sock_map_close)
Maybe we can fix it as:
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 99ca4465f702..791cc32dccfd 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -1025,15 +1025,22 @@ static int tls_init(struct sock *sk)
static void tls_update(struct sock *sk, struct proto *p,
void (*write_space)(struct sock *sk))
{
+ struct sk_psock *psock = sk_psock(sk);
struct tls_context *ctx;
+ bool restore = true;
WARN_ON_ONCE(sk->sk_prot == p);
+ if (psock)
+ restore = !!refcount_read(&psock->refcnt);
+
ctx = tls_get_ctx(sk);
if (likely(ctx)) {
ctx->sk_write_space = write_space;
ctx->sk_proto = p;
- } else {
+ }
+
+ if (ctx == NULL || !restore) {
/* Pairs with lockless read in sk_clone_lock(). */
WRITE_ONCE(sk->sk_prot, p);
sk->sk_write_space = write_space;
Powered by blists - more mailing lists