[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZc_8FfHKA0rEvgx8T0xRWQp-2scm1N+nwroXi5enDh_g@mail.gmail.com>
Date: Mon, 13 May 2019 22:04:03 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: Alexei Starovoitov <ast@...nel.org>, Martin Lau <kafai@...com>,
bpf@...r.kernel.org, Networking <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf 1/3] bpf: add map_lookup_elem_sys_only for lookups
from syscall side
On Mon, May 13, 2019 at 4:20 PM Daniel Borkmann <daniel@...earbox.net> wrote:
>
> Add a callback map_lookup_elem_sys_only() that map implementations
> could use over map_lookup_elem() from system call side in case the
> map implementation needs to handle the latter differently than from
> the BPF data path. If map_lookup_elem_sys_only() is set, this will
> be preferred pick for map lookups out of user space. This hook is
This is kind of surprising behavior w/ preferred vs default lookup
code path. Why the desired behavior can't be achieved with an extra
flag, similar to BPF_F_LOCK? It seems like it will be more explicit,
more extensible and more generic approach, avoiding duplication of
lookup semantics.
E.g., for LRU map, with flag on lookup, one can decide whether lookup
from inside BPF program (not just from syscall side!) should modify
LRU ordering or not, simply by specifying extra flag. Am I missing
some complication that prevents us from doing it that way?
> used in a follow-up fix for LRU map, but once development window
> opens, we can convert other map types from map_lookup_elem() (here,
> the one called upon BPF_MAP_LOOKUP_ELEM cmd is meant) over to use
> the callback to simplify and clean up the latter.
>
> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
> Acked-by: Martin KaFai Lau <kafai@...com>
> ---
> include/linux/bpf.h | 1 +
> kernel/bpf/syscall.c | 5 ++++-
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 59631dd..4fb3aa2 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -36,6 +36,7 @@ struct bpf_map_ops {
> void (*map_free)(struct bpf_map *map);
> int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key);
> void (*map_release_uref)(struct bpf_map *map);
> + void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
>
> /* funcs callable from userspace and from eBPF programs */
> void *(*map_lookup_elem)(struct bpf_map *map, void *key);
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index ad3ccf8..cb5440b 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -808,7 +808,10 @@ static int map_lookup_elem(union bpf_attr *attr)
> err = map->ops->map_peek_elem(map, value);
> } else {
> rcu_read_lock();
> - ptr = map->ops->map_lookup_elem(map, key);
> + if (map->ops->map_lookup_elem_sys_only)
> + ptr = map->ops->map_lookup_elem_sys_only(map, key);
> + else
> + ptr = map->ops->map_lookup_elem(map, key);
> if (IS_ERR(ptr)) {
> err = PTR_ERR(ptr);
> } else if (!ptr) {
> --
> 2.9.5
>
Powered by blists - more mailing lists