netdev - Re: [PATCH bpf-next 1/2] bpf: implement syscall command BPF_MAP_GET_NEXT


Message-ID: <78089cd2-9f12-267a-4b5c-d2e39e47932f@fb.com>
Date:   Thu, 4 Jan 2018 13:32:10 -0800
From:   Yonghong Song <yhs@...com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
CC:     <ast@...com>, <daniel@...earbox.net>, <netdev@...r.kernel.org>,
        <kernel-team@...com>
Subject: Re: [PATCH bpf-next 1/2] bpf: implement syscall command
 BPF_MAP_GET_NEXT_KEY for stacktrace map



On 1/4/18 1:08 PM, Jakub Kicinski wrote:
> On Wed, 3 Jan 2018 23:27:45 -0800, Yonghong Song wrote:
>> Currently, bpf syscall command BPF_MAP_GET_NEXT_KEY is not
>> supported for stacktrace map. However, there are use cases where
>> user space wants to enumerate all stacktrace map entries where
>> BPF_MAP_GET_NEXT_KEY command will be really helpful.
>> In addition, if user space wants to delete all map entries
>> in order to save memory and does not want to close the
>> map file descriptor, BPF_MAP_GET_NEXT_KEY may help improve
>> performance if map entries are sparsely populated.
>>
>> The implementation follows the API specification of existing
>> BPF_MAP_GET_NEXT_KEY implementation. If user provides
>> a NULL key pointer, the first key is returned. Otherwise,
>> the first valid key after the input parameter "key"
>> is returned, or -ENOENT if no valid key can be found.
>>
>> Signed-off-by: Yonghong Song <yhs@...com>
>> ---
>>   kernel/bpf/stackmap.c | 23 +++++++++++++++++++++--
>>   1 file changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>> index a15bc63..207b21c 100644
>> --- a/kernel/bpf/stackmap.c
>> +++ b/kernel/bpf/stackmap.c
>> @@ -226,9 +226,28 @@ int bpf_stackmap_copy(struct bpf_map *map, void *key, void *value)
>>   	return 0;
>>   }
>>   
>> -static int stack_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
>> +static int stack_map_get_next_key(struct bpf_map *map, void *key,
>> +				  void *next_key)
>>   {
>> -	return -EINVAL;
>> +	struct bpf_stack_map *smap = container_of(map,
>> +						  struct bpf_stack_map, map);
>> +	u32 id;
>> +
>> +	WARN_ON_ONCE(!rcu_read_lock_held());
>> +
>> +	if (!key)
>> +		id = 0;
>> +	else
>> +		id = *(u32 *)key + 1;
>> +
>> +	while (id < smap->n_buckets && !smap->buckets[id])
>> +		id++;
>> +
>> +	if (id >= smap->n_buckets)
>> +		return -ENOENT;
> 
> AFAIU for hash maps the semantics of get next are as follows:
> 
> get_next(map, key) {
> 	if (!key)
> 		return get_first(map);
> 
> 	elem = lookup(map, key);
> 	if (!elem)                       // <-- note this branch
> 		return get_first(map);
> 	if (elem->next)
> 		return elem->next->key;
> 	return -ENOENT;
> }
> 
> For arrays elements always exist, hence the elem->next check is
> omitted.  Here you are, however, testing !smap->buckets[id] so I assume
> elements may not exist.

Right, buckets[id] could be NULL.

> 
> Is there any precedent for get_next on non-existent key returning
> element other than first?  The stacktrace map is a bit special, and

Sorry, I missed this. You are right: the hashtable get_next_key returns
the first key for a non-existent key, and every other get_next_key
implementation is an array variant where all keys already exist.

> returning id + 1 would defeat what you're trying to do here..  Is there
> value in keeping the behaviour consistent across map types?

Let us keep the behavior consistent with hashtable then.
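
For illustration only, here is a minimal user-space sketch of what the
hashtable-consistent behavior could look like. It is not the v2 patch.
struct toy_stack_map and every toy_* function are hypothetical stand-ins
for struct bpf_stack_map and the kernel callbacks, and the sketch assumes
that a NULL bucket pointer means "no entry", as in the quoted diff.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_stack_map (user space only). */
struct toy_stack_map {
	uint32_t n_buckets;
	void **buckets;		/* a NULL slot means "no entry" */
};

/* Store the first populated bucket index >= id, or return -ENOENT. */
static int toy_scan_from(const struct toy_stack_map *smap, uint32_t id,
			 uint32_t *next_key)
{
	while (id < smap->n_buckets && !smap->buckets[id])
		id++;

	if (id >= smap->n_buckets)
		return -ENOENT;

	*next_key = id;
	return 0;
}

static int toy_get_next_key(const struct toy_stack_map *smap,
			    const uint32_t *key, uint32_t *next_key)
{
	/*
	 * Hashtable-style rule: a NULL key, an out-of-range key, or a key
	 * whose bucket is empty all restart the walk from the first entry.
	 */
	if (!key || *key >= smap->n_buckets || !smap->buckets[*key])
		return toy_scan_from(smap, 0, next_key);

	/* The key exists, so continue with the next populated bucket. */
	return toy_scan_from(smap, *key + 1, next_key);
}

/* Empty the map with a single walk and return the number of entries. */
static uint32_t toy_delete_all(struct toy_stack_map *smap)
{
	uint32_t key, next = 0, n = 0;
	int more = toy_get_next_key(smap, NULL, &key) == 0;

	while (more) {
		/* Fetch the successor before the current slot is emptied. */
		more = toy_get_next_key(smap, &key, &next) == 0;
		smap->buckets[key] = NULL;
		n++;
		key = next;
	}
	return n;
}
```

With this rule a caller that passes a key it has already deleted is sent
back to the first entry, so the delete-all loop above fetches the
successor before clearing the current slot.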

> 
> Anyway, you said in the commit message that "The implementation follows
> the API specification of existing BPF_MAP_GET_NEXT_KEY implementation."
> and I find that arguable :)

You are right. I will send v2 soon, with the commit message reworded as well.

>> +	*(u32 *)next_key = id;
>> +	return 0;
>>   }
>>   
>>   static int stack_map_update_elem(struct bpf_map *map, void *key, void *value,
>