netdev - Re: [PATCH bpf-next 1/2] bpf: implement syscall command BPF_MAP_GET_NEXT

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180104130822.377721d8@cakuba.netronome.com>
Date:   Thu, 4 Jan 2018 13:08:22 -0800
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Yonghong Song <yhs@...com>
Cc:     <ast@...com>, <daniel@...earbox.net>, <netdev@...r.kernel.org>,
        <kernel-team@...com>
Subject: Re: [PATCH bpf-next 1/2] bpf: implement syscall command
 BPF_MAP_GET_NEXT_KEY for stacktrace map

On Wed, 3 Jan 2018 23:27:45 -0800, Yonghong Song wrote:
> Currently, bpf syscall command BPF_MAP_GET_NEXT_KEY is not
> supported for stacktrace map. However, there are use cases where
> user space wants to enumerate all stacktrace map entries where
> BPF_MAP_GET_NEXT_KEY command will be really helpful.
> In addition, if user space wants to delete all map entries
> in order to save memory and does not want to close the
> map file descriptor, BPF_MAP_GET_NEXT_KEY may help improve
> performance if map entries are sparsely populated.
> 
> The implementation follows the API specification of existing
> BPF_MAP_GET_NEXT_KEY implementation. If user provides
> an NULL key pointer, the first key is returned. Otherwise,
> the first valid key after the input parameter "key"
> is returned, or -ENOENT if no valid key can be found.
> 
> Signed-off-by: Yonghong Song <yhs@...com>
> ---
>  kernel/bpf/stackmap.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index a15bc63..207b21c 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -226,9 +226,28 @@ int bpf_stackmap_copy(struct bpf_map *map, void *key, void *value)
>  	return 0;
>  }
>  
> -static int stack_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
> +static int stack_map_get_next_key(struct bpf_map *map, void *key,
> +				  void *next_key)
>  {
> -	return -EINVAL;
> +	struct bpf_stack_map *smap = container_of(map,
> +						  struct bpf_stack_map, map);
> +	u32 id;
> +
> +	WARN_ON_ONCE(!rcu_read_lock_held());
> +
> +	if (!key)
> +		id = 0;
> +	else
> +		id = *(u32 *)key + 1;
> +
> +	while (id < smap->n_buckets && !smap->buckets[id])
> +		id++;
> +
> +	if (id >= smap->n_buckets)
> +		return -ENOENT;

AFAIU for hash maps the semantics of get next are as follows:

get_next(map, key) {
	if (!key)
		return get_first(map);

	elem = lookup(map, key);
	if (!elem)                       // <-- note this branch
		return get_first(map);
	if (elem->next)
		return elem->next->key;
	return -ENOENT;
}

For arrays elements always exist, hence the elem->next check is
omitted.  Here you are, however, testing !smap->buckets[id] so I assume
elements may not exist.  

Is there any precedent for get_next on non-existent key returning
element other than first?  The stacktrace map is a bit special, and
returning id + 1 would defeat what you're trying to do here..  Is there
value in keeping the behaviour consistent across map types?  

Anyway, you said in the commit message that "The implementation follows
the API specification of existing BPF_MAP_GET_NEXT_KEY implementation."
and I find that arguable :)

> +	*(u32 *)next_key = id;
> +	return 0;
>  }
>  
>  static int stack_map_update_elem(struct bpf_map *map, void *key, void *value,