Message-ID: <CAEf4Bzb9js_4UFChVWOjw52ik5TmNJroF5bXSicJtxyNZH8k3A@mail.gmail.com>
Date: Thu, 4 Aug 2022 12:03:04 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Martin KaFai Lau <kafai@...com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
David Miller <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, kernel-team@...com,
Paolo Abeni <pabeni@...hat.com>,
Stanislav Fomichev <sdf@...gle.com>
Subject: Re: [PATCH v2 bpf-next 02/15] bpf: net: Avoid sk_setsockopt() taking
sk lock when called from bpf
On Wed, Aug 3, 2022 at 1:49 PM Martin KaFai Lau <kafai@...com> wrote:
>
> Most of the code in bpf_setsockopt(SOL_SOCKET) is duplicated from
> sk_setsockopt(). The number of supported optnames keeps growing,
> and so does the duplicated code.
>
> One issue in reusing sk_setsockopt() is that the bpf prog
> has already acquired the sk lock. This patch adds an in_bpf()
> helper to tell whether sk_setsockopt() is being called from a
> bpf prog. The bpf prog calling bpf_setsockopt() is running
> either in_task() or in_serving_softirq(). Both cases have
> current->bpf_ctx initialized, so in_bpf() only needs to test
> !!current->bpf_ctx.
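
(For context: current->bpf_ctx is published by the run-context
helpers that wrap program invocation. Roughly, as a simplified
sketch of the bpf_set_run_ctx()/bpf_reset_run_ctx() pattern from
include/linux/bpf.h, not the exact kernel code:

  struct bpf_run_ctx *old_run_ctx;
  struct bpf_cg_run_ctx run_ctx;

  /* publish: current->bpf_ctx now points at run_ctx.run_ctx */
  old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
  /* ... the BPF program runs here; in_bpf() returns true ... */
  /* unpublish: restore whatever context was there before */
  bpf_reset_run_ctx(old_run_ctx);

)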
>
> This patch also adds sockopt_{lock,release}_sock() helpers
> for sk_setsockopt() to use. These helpers test in_bpf()
> before acquiring/releasing the lock. They are exported with
> EXPORT_SYMBOL for the ipv6 module to use in a later patch.
>
> Note on the change in sock_setbindtodevice(): sockopt_lock_sock()
> is now called in sock_setbindtodevice() itself instead of taking
> the lock through sock_bindtoindex(..., lock_sk = true).
>
> Signed-off-by: Martin KaFai Lau <kafai@...com>
> ---
> include/linux/bpf.h | 8 ++++++++
> include/net/sock.h | 3 +++
> net/core/sock.c | 26 +++++++++++++++++++++++---
> 3 files changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 20c26aed7896..b905b1b34fe4 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1966,6 +1966,10 @@ static inline bool unprivileged_ebpf_enabled(void)
> return !sysctl_unprivileged_bpf_disabled;
> }
>
> +static inline bool in_bpf(void)
I think this function deserves a big comment explaining that it's not
100% accurate, as not every BPF program type sets bpf_ctx. As it is
named, in_bpf() promises a lot more generality than it actually
provides.

Should this be named something more specific, like
has_current_bpf_ctx(), maybe?

Also, separately, should we make an effort to set bpf_ctx for all
program types (instead of, or in addition to, the above)?
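
To make that concrete, a rough sketch of the rename plus comment
(naming and wording are illustrative only):

  /* Note: only program types dispatched through the run-ctx
   * helpers (e.g. cgroup progs) have current->bpf_ctx set, so
   * this is not a general "are we running BPF?" test. It only
   * reports whether the current task has a BPF run context
   * installed.
   */
  static inline bool has_current_bpf_ctx(void)
  {
          return !!current->bpf_ctx;
  }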
> +{
> + return !!current->bpf_ctx;
> +}
> #else /* !CONFIG_BPF_SYSCALL */
> static inline struct bpf_prog *bpf_prog_get(u32 ufd)
> {
> @@ -2175,6 +2179,10 @@ static inline bool unprivileged_ebpf_enabled(void)
> return false;
> }
>
> +static inline bool in_bpf(void)
> +{
> + return false;
> +}
> #endif /* CONFIG_BPF_SYSCALL */
>
> void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
> diff --git a/include/net/sock.h b/include/net/sock.h
> index a7273b289188..b2ff230860c6 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1721,6 +1721,9 @@ static inline void unlock_sock_fast(struct sock *sk, bool slow)
> }
> }
>
> +void sockopt_lock_sock(struct sock *sk);
> +void sockopt_release_sock(struct sock *sk);
> +
> /* Used by processes to "lock" a socket state, so that
> * interrupts and bottom half handlers won't change it
> * from under us. It essentially blocks any incoming
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 20269c37ab3b..82759540ae2c 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -703,7 +703,9 @@ static int sock_setbindtodevice(struct sock *sk, sockptr_t optval, int optlen)
> goto out;
> }
>
> - return sock_bindtoindex(sk, index, true);
> + sockopt_lock_sock(sk);
> + ret = sock_bindtoindex_locked(sk, index);
> + sockopt_release_sock(sk);
> out:
> #endif
>
> @@ -1036,6 +1038,24 @@ static int sock_reserve_memory(struct sock *sk, int bytes)
> return 0;
> }
>
> +void sockopt_lock_sock(struct sock *sk)
> +{
> + if (in_bpf())
> + return;
> +
> + lock_sock(sk);
> +}
> +EXPORT_SYMBOL(sockopt_lock_sock);
> +
> +void sockopt_release_sock(struct sock *sk)
> +{
> + if (in_bpf())
> + return;
> +
> + release_sock(sk);
> +}
> +EXPORT_SYMBOL(sockopt_release_sock);
> +
> /*
> * This is meant for all protocols to use and covers goings on
> * at the socket level. Everything here is generic.
> @@ -1067,7 +1087,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
>
> valbool = val ? 1 : 0;
>
> - lock_sock(sk);
> + sockopt_lock_sock(sk);
>
> switch (optname) {
> case SO_DEBUG:
> @@ -1496,7 +1516,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
> ret = -ENOPROTOOPT;
> break;
> }
> - release_sock(sk);
> + sockopt_release_sock(sk);
> return ret;
> }
>
> --
> 2.30.2
>
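
For reference, the calling context this patch cares about looks
roughly like the sockops program below; by the time bpf_setsockopt()
reaches sk_setsockopt(), the BPF layer already holds the sk lock
(untested sketch; the socket-level constants are defined by hand
here just for illustration):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  #ifndef SOL_SOCKET
  #define SOL_SOCKET	1	/* 1 on most architectures */
  #endif
  #ifndef SO_KEEPALIVE
  #define SO_KEEPALIVE	9
  #endif

  SEC("sockops")
  int enable_keepalive(struct bpf_sock_ops *skops)
  {
  	int one = 1;

  	/* lands in sk_setsockopt() with the sk lock already held */
  	if (skops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB)
  		bpf_setsockopt(skops, SOL_SOCKET, SO_KEEPALIVE,
  			       &one, sizeof(one));
  	return 1;
  }

  char _license[] SEC("license") = "GPL";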