[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200113234541.sru7domciovzijnx@kafai-mbp.dhcp.thefacebook.com>
Date: Mon, 13 Jan 2020 23:45:45 +0000
From: Martin Lau <kafai@...com>
To: Jakub Sitnicki <jakub@...udflare.com>
CC: "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kernel-team@...udflare.com" <kernel-team@...udflare.com>,
Eric Dumazet <edumazet@...gle.com>,
"John Fastabend" <john.fastabend@...il.com>,
Lorenz Bauer <lmb@...udflare.com>
Subject: Re: [PATCH bpf-next v2 09/11] bpf: Allow selecting reuseport socket
from a SOCKMAP
On Fri, Jan 10, 2020 at 11:50:25AM +0100, Jakub Sitnicki wrote:
> SOCKMAP now supports storing references to listening sockets. Nothing keeps
> us from using it as an array of sockets to select from in SK_REUSEPORT
> programs.
>
> Whitelist the map type with the BPF helper for selecting socket.
>
> The restriction that the socket has to be a member of a reuseport group
> still applies. Socket from a SOCKMAP that does not have sk_reuseport_cb set
> is not a valid target and we signal it with -EINVAL.
>
> Signed-off-by: Jakub Sitnicki <jakub@...udflare.com>
> ---
> kernel/bpf/verifier.c | 6 ++++--
> net/core/filter.c | 15 ++++++++++-----
> 2 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f5af759a8a5f..0ee5f1594b5c 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3697,7 +3697,8 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
> if (func_id != BPF_FUNC_sk_redirect_map &&
> func_id != BPF_FUNC_sock_map_update &&
> func_id != BPF_FUNC_map_delete_elem &&
> - func_id != BPF_FUNC_msg_redirect_map)
> + func_id != BPF_FUNC_msg_redirect_map &&
> + func_id != BPF_FUNC_sk_select_reuseport)
> goto error;
> break;
> case BPF_MAP_TYPE_SOCKHASH:
> @@ -3778,7 +3779,8 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
> goto error;
> break;
> case BPF_FUNC_sk_select_reuseport:
> - if (map->map_type != BPF_MAP_TYPE_REUSEPORT_SOCKARRAY)
> + if (map->map_type != BPF_MAP_TYPE_REUSEPORT_SOCKARRAY &&
> + map->map_type != BPF_MAP_TYPE_SOCKMAP)
> goto error;
> break;
> case BPF_FUNC_map_peek_elem:
> diff --git a/net/core/filter.c b/net/core/filter.c
> index a702761ef369..c79c62a54167 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -8677,6 +8677,7 @@ struct sock *bpf_run_sk_reuseport(struct sock_reuseport *reuse, struct sock *sk,
> BPF_CALL_4(sk_select_reuseport, struct sk_reuseport_kern *, reuse_kern,
> struct bpf_map *, map, void *, key, u32, flags)
> {
> + bool is_sockarray = map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY;
> struct sock_reuseport *reuse;
> struct sock *selected_sk;
>
> @@ -8685,12 +8686,16 @@ BPF_CALL_4(sk_select_reuseport, struct sk_reuseport_kern *, reuse_kern,
> return -ENOENT;
>
> reuse = rcu_dereference(selected_sk->sk_reuseport_cb);
> - if (!reuse)
> - /* selected_sk is unhashed (e.g. by close()) after the
> - * above map_lookup_elem(). Treat selected_sk has already
> - * been removed from the map.
> + if (!reuse) {
> + /* reuseport_array has only sk with non NULL sk_reuseport_cb.
> + * The only (!reuse) case here is - the sk has already been
> + * unhashed (e.g. by close()), so treat it as -ENOENT.
> + *
> + * Other maps (e.g. sock_map) do not provide this guarantee and
> + * the sk may never be in the reuseport group to begin with.
> */
> - return -ENOENT;
> + return is_sockarray ? -ENOENT : -EINVAL;
> + }
>
> if (unlikely(reuse->reuseport_id != reuse_kern->reuseport_id)) {
I guess the later testing patch passed is because reuseport_id is init to 0.
Note that in reuseport_array, reuseport_get_id() is called at update_elem() to
init the reuse->reuseport_id. It was done there because reuseport_array
was the only one requiring reuseport_id. It is to ensure the bpf_prog
cannot accidentally use a sk from another reuseport-group.
The same has to be done in patch 5 or may be considering to
move it to reuseport_alloc() itself.
Powered by blists - more mailing lists