[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <57d0b912-2444-9f95-897f-f83affcef57e@polito.it>
Date: Wed, 3 Oct 2018 12:01:37 -0500
From: Mauricio Vasquez <mauricio.vasquez@...ito.it>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: daniel@...earbox.net, netdev@...r.kernel.org
Subject: Re: [RFC PATCH bpf-next v3 4/7] bpf: add bpf queue and stack maps
On 10/01/2018 07:26 PM, Alexei Starovoitov wrote:
> On Mon, Oct 01, 2018 at 08:11:43AM -0500, Mauricio Vasquez wrote:
>>>>> +BPF_CALL_3(bpf_map_pop_elem, struct bpf_map *, map, void *,
>>>>> value, u32, size)
>>>>> +{
>>>>> + void *ptr;
>>>>> +
>>>>> + if (map->value_size != size)
>>>>> + return -EINVAL;
>>>>> +
>>>>> + ptr = map->ops->map_lookup_and_delete_elem(map, NULL);
>>>>> + if (!ptr)
>>>>> + return -ENOENT;
>>>>> +
>>>>> + switch (size) {
>>>>> + case 1:
>>>>> + *(u8 *) value = *(u8 *) ptr;
>>>>> + break;
>>>>> + case 2:
>>>>> + *(u16 *) value = *(u16 *) ptr;
>>>>> + break;
>>>>> + case 4:
>>>>> + *(u32 *) value = *(u32 *) ptr;
>>>>> + break;
>>>>> + case 8:
>>>>> + *(u64 *) value = *(u64 *) ptr;
>>>>> + break;
>>>> this is inefficient. can we pass value ptr into ops and let it
>>>> populate it?
>>> I don't think so, doing that implies that look_and_delete will be a
>>> per-value op, while other ops in maps are per-reference.
>>> For instance, how to change it in the case of peek helper that is using
>>> the lookup operation?, we cannot change the signature of the lookup
>>> operation.
>>>
>>> This is something that worries me a little bit, we are creating new
>>> per-value helpers based on already existing per-reference operations,
>>> this is not probably the cleanest way. Here we are at the beginning of
>>> the discussion once again, how should we map helpers and syscalls to
>>> ops.
>>>
>>> What about creating pop/peek/push ops, mapping helpers one to one and
>>> adding some logic into syscall.c to call the correct operation in case
>>> the map is stack/queue?
>>> Syscall mapping would be:
>>> bpf_map_lookup_elem() -> peek
>>> bpf_map_lookup_and_delete_elem() -> pop
>>> bpf_map_update_elem() -> push
>>>
>>> Does it make sense?
>> Hello Alexei,
>>
>> Do you have any feedback on this specific part?
> Indeed. It seems push/pop ops will be cleaner.
> I still think that peek() is useless due to races.
> So BPF_MAP_UPDATE_ELEM syscall cmd will map to 'push' ops
> and new BPF_MAP_LOOKUP_AND_DELETE_ELEM will map to 'pop' ops.
> right?
>
>
That's right.
While updating the push api some came to me, do we have any specific
reason to support only 1, 2, 4, 8 bytes? I think we could do it general
enough to support any number of bytes, if the user is worried about the
cost of memcpys he could use a map of 8 bytes pointers as you mentioned
some time ago.
From an API point of view, pop/peek helpers already expect a void
*value (int bpf_map_[pop, peek]_elem(map, void *value)), the only change
would also to use a pointer in the push instead of a u64.
Powered by blists - more mailing lists