[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <78089cd2-9f12-267a-4b5c-d2e39e47932f@fb.com>
Date: Thu, 4 Jan 2018 13:32:10 -0800
From: Yonghong Song <yhs@...com>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>
CC: <ast@...com>, <daniel@...earbox.net>, <netdev@...r.kernel.org>,
<kernel-team@...com>
Subject: Re: [PATCH bpf-next 1/2] bpf: implement syscall command
BPF_MAP_GET_NEXT_KEY for stacktrace map
On 1/4/18 1:08 PM, Jakub Kicinski wrote:
> On Wed, 3 Jan 2018 23:27:45 -0800, Yonghong Song wrote:
>> Currently, bpf syscall command BPF_MAP_GET_NEXT_KEY is not
>> supported for stacktrace map. However, there are use cases where
>> user space wants to enumerate all stacktrace map entries where
>> BPF_MAP_GET_NEXT_KEY command will be really helpful.
>> In addition, if user space wants to delete all map entries
>> in order to save memory and does not want to close the
>> map file descriptor, BPF_MAP_GET_NEXT_KEY may help improve
>> performance if map entries are sparsely populated.
>>
>> The implementation follows the API specification of existing
>> BPF_MAP_GET_NEXT_KEY implementation. If user provides
>> an NULL key pointer, the first key is returned. Otherwise,
>> the first valid key after the input parameter "key"
>> is returned, or -ENOENT if no valid key can be found.
>>
>> Signed-off-by: Yonghong Song <yhs@...com>
>> ---
>> kernel/bpf/stackmap.c | 23 +++++++++++++++++++++--
>> 1 file changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>> index a15bc63..207b21c 100644
>> --- a/kernel/bpf/stackmap.c
>> +++ b/kernel/bpf/stackmap.c
>> @@ -226,9 +226,28 @@ int bpf_stackmap_copy(struct bpf_map *map, void *key, void *value)
>> return 0;
>> }
>>
>> -static int stack_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
>> +static int stack_map_get_next_key(struct bpf_map *map, void *key,
>> + void *next_key)
>> {
>> - return -EINVAL;
>> + struct bpf_stack_map *smap = container_of(map,
>> + struct bpf_stack_map, map);
>> + u32 id;
>> +
>> + WARN_ON_ONCE(!rcu_read_lock_held());
>> +
>> + if (!key)
>> + id = 0;
>> + else
>> + id = *(u32 *)key + 1;
>> +
>> + while (id < smap->n_buckets && !smap->buckets[id])
>> + id++;
>> +
>> + if (id >= smap->n_buckets)
>> + return -ENOENT;
>
> AFAIU for hash maps the semantics of get next are as follows:
>
> get_next(map, key) {
> if (!key)
> return get_first(map);
>
> elem = lookup(map, key);
> if (!elem) // <-- note this branch
> return get_first(map);
> if (elem->next)
> return elem->next->key;
> return -ENOENT;
> }
>
> For arrays elements always exist, hence the elem->next check is
> omitted. Here you are, however, testing !smap->buckets[id] so I assume
> elements may not exist.
Right, buckets[id] could be NULL.
>
> Is there any precedent for get_next on non-existent key returning
> element other than first? The stacktrace map is a bit special, and
Sorry, I miss this. You are right. hashtable get_next_key will return
the first for non-existing key. And all other implemented get_next_key
is a variant of arrays where all keys already exist.
> returning id + 1 would defeat what you're trying to do here.. Is there
> value in keeping the behaviour consistent across map types?
Let us keep the behavior consistent with hashtable then.
>
> Anyway, you said in the commit message that "The implementation follows
> the API specification of existing BPF_MAP_GET_NEXT_KEY implementation."
> and I find that arguable :)
You are right. Will send v2 soon with re-wording of commit message as well.
>> + *(u32 *)next_key = id;
>> + return 0;
>> }
>>
>> static int stack_map_update_elem(struct bpf_map *map, void *key, void *value,
>
Powered by blists - more mailing lists