Message-ID: <20220803081413.3cc27002@kernel.org>
Date: Wed, 3 Aug 2022 08:14:13 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Hawkins Jiawei <yin31149@...il.com>, kafai@...com
Cc: syzbot+5f26f85569bd179c18ce@...kaller.appspotmail.com,
18801353760@....com, andrii@...nel.org, ast@...nel.org,
borisp@...dia.com, bpf@...r.kernel.org, daniel@...earbox.net,
davem@...emloft.net, edumazet@...gle.com, jakub@...udflare.com,
john.fastabend@...il.com, kgraul@...ux.ibm.com, kpsingh@...nel.org,
linux-kernel-mentees@...ts.linuxfoundation.org,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
pabeni@...hat.com, paskripkin@...il.com, skhan@...uxfoundation.org,
songliubraving@...com, syzkaller-bugs@...glegroups.com, yhs@...com,
Wen Gu <guwen@...ux.alibaba.com>
Subject: Re: [PATCH v4] net: fix refcount bug in sk_psock_get (2)
On Wed, 3 Aug 2022 20:41:22 +0800 Hawkins Jiawei wrote:
> -/* Pointer stored in sk_user_data might not be suitable for copying
> - * when cloning the socket. For instance, it can point to a reference
> - * counted object. sk_user_data bottom bit is set if pointer must not
> - * be copied.
> +/* flag bits in sk_user_data
> + *
> + * SK_USER_DATA_NOCOPY - Pointer stored in sk_user_data might
> + * not be suitable for copying when cloning the socket.
> + * For instance, it can point to a reference counted object.
> + * sk_user_data bottom bit is set if pointer must not be copied.
> + *
> + * SK_USER_DATA_BPF - Managed by BPF
I'd use this opportunity to add more info here; "BPF" alone is too general.
Maybe "Pointer is used by a BPF reuseport array"? Martin, WDYT?
> + * SK_USER_DATA_PSOCK - Mark whether pointer stored in sk_user_data points
> + * to psock type. This bit should be set when sk_user_data is
> + * assigned to a psock object.
> +/**
> + * rcu_dereference_sk_user_data_psock - return psock if sk_user_data
> + * points to the psock type(SK_USER_DATA_PSOCK flag is set), otherwise
> + * return NULL
> + *
> + * @sk: socket
> + */
> +static inline
> +struct sk_psock *rcu_dereference_sk_user_data_psock(const struct sock *sk)
nit: the return type more commonly goes on the same line as "static
inline"
> +{
> +	uintptr_t __tmp = (uintptr_t)rcu_dereference(__sk_user_data((sk)));
> +
> +	if (__tmp & SK_USER_DATA_PSOCK)
> +		return (struct sk_psock *)(__tmp & SK_USER_DATA_PTRMASK);
> +
> +	return NULL;
> +}
As a follow-up, we can probably generalize this into
__rcu_dereference_sk_user_data_cond(sk, bit)
and make the psock just call that:
static inline struct sk_psock *
rcu_dereference_sk_user_data_psock(const struct sock *sk)
{
	return __rcu_dereference_sk_user_data_cond(sk, SK_USER_DATA_PSOCK);
}
then reuseport can also benefit, maybe:
diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index e2618fb5870e..ad5c447a690c 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -21,14 +21,11 @@ static struct reuseport_array *reuseport_array(struct bpf_map *map)
 /* The caller must hold the reuseport_lock */
 void bpf_sk_reuseport_detach(struct sock *sk)
 {
-	uintptr_t sk_user_data;
+	struct sock __rcu **socks;

 	write_lock_bh(&sk->sk_callback_lock);
-	sk_user_data = (uintptr_t)sk->sk_user_data;
-	if (sk_user_data & SK_USER_DATA_BPF) {
-		struct sock __rcu **socks;
-
-		socks = (void *)(sk_user_data & SK_USER_DATA_PTRMASK);
+	socks = __rcu_dereference_sk_user_data_cond(sk, SK_USER_DATA_BPF);
+	if (socks) {
 		WRITE_ONCE(sk->sk_user_data, NULL);
 		/*
 		 * Do not move this NULL assignment outside of
But that must be a separate patch, not part of this fix.
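
To make the tagged-pointer idea concrete outside the kernel, here is a
minimal userspace sketch of a conditional accessor along the lines of the
helper suggested above. Everything in it is an assumption for illustration:
the demo_* names, the struct, and the flag values are hypothetical, and a
plain load stands in for rcu_dereference(), so it shows only the bit test
and the masking, not the RCU or locking rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits; values are assumed for this sketch only. */
#define DEMO_USER_DATA_NOCOPY	((uintptr_t)1)
#define DEMO_USER_DATA_BPF	((uintptr_t)2)
#define DEMO_USER_DATA_PSOCK	((uintptr_t)4)
#define DEMO_USER_DATA_FLAGS	(DEMO_USER_DATA_NOCOPY | \
				 DEMO_USER_DATA_BPF | \
				 DEMO_USER_DATA_PSOCK)
#define DEMO_USER_DATA_PTRMASK	(~DEMO_USER_DATA_FLAGS)

/* Hypothetical stand-in for a socket; only the tagged pointer matters. */
struct demo_sock {
	void *user_data;	/* pointer with flags in the low three bits */
};

/* Store ptr together with its flag bits; ptr must be 8-byte aligned. */
static inline void demo_assign_user_data(struct demo_sock *sk, void *ptr,
					 uintptr_t flags)
{
	assert(((uintptr_t)ptr & DEMO_USER_DATA_FLAGS) == 0);
	assert((flags & DEMO_USER_DATA_PTRMASK) == 0);
	sk->user_data = (void *)((uintptr_t)ptr | flags);
}

/* Return the untagged pointer if flag is set, otherwise NULL. */
static inline void *demo_user_data_cond(const struct demo_sock *sk,
					uintptr_t flag)
{
	uintptr_t tmp = (uintptr_t)sk->user_data;

	if (tmp & flag)
		return (void *)(tmp & DEMO_USER_DATA_PTRMASK);

	return NULL;
}
```

With something like this, a psock-style accessor is a one-line wrapper that
passes DEMO_USER_DATA_PSOCK, and a reuseport-style caller passes
DEMO_USER_DATA_BPF and treats NULL as "this pointer is not mine".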