[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180807144226.pmalhe3mjvc3a45y@ast-mbp.dhcp.thefacebook.com>
Date: Tue, 7 Aug 2018 07:42:28 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Mauricio Vasquez B <mauricio.vasquez@...ito.it>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next 1/3] bpf: add bpf queue map
On Mon, Aug 06, 2018 at 03:58:30PM +0200, Mauricio Vasquez B wrote:
> Bpf queue implements a LIFO/FIFO data containers for ebpf programs.
queue/stack datastructure would be a great addition.
> It allows to push an element to the queue by using the update operation
> and to pop an element from the queue by using the lookup operation.
>
> A use case for this is to keep track of a pool of elements, like
> network ports in a SNAT.
>
> Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@...ito.it>
> ---
> include/linux/bpf_types.h | 1
> include/uapi/linux/bpf.h | 5 +
> kernel/bpf/Makefile | 2
> kernel/bpf/queuemap.c | 287 +++++++++++++++++++++++++++++++++++++++++++++
> kernel/bpf/syscall.c | 61 +++++++---
> kernel/bpf/verifier.c | 16 ++-
> 6 files changed, 353 insertions(+), 19 deletions(-)
> create mode 100644 kernel/bpf/queuemap.c
>
> diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
> index c5700c2d5549..6c7a62f3fe43 100644
> --- a/include/linux/bpf_types.h
> +++ b/include/linux/bpf_types.h
> @@ -58,3 +58,4 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_CPUMAP, cpu_map_ops)
> BPF_MAP_TYPE(BPF_MAP_TYPE_XSKMAP, xsk_map_ops)
> #endif
> #endif
> +BPF_MAP_TYPE(BPF_MAP_TYPE_QUEUE, queue_map_ops)
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 0ebaaf7f3568..2c171c40eb45 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -120,6 +120,7 @@ enum bpf_map_type {
> BPF_MAP_TYPE_CPUMAP,
> BPF_MAP_TYPE_XSKMAP,
> BPF_MAP_TYPE_SOCKHASH,
> + BPF_MAP_TYPE_QUEUE,
> };
>
> enum bpf_prog_type {
> @@ -255,6 +256,10 @@ enum bpf_attach_type {
> /* Flag for stack_map, store build_id+offset instead of pointer */
> #define BPF_F_STACK_BUILD_ID (1U << 5)
>
> +/* Flags for queue_map, type of queue */
> +#define BPF_F_QUEUE_FIFO (1U << 16)
> +#define BPF_F_QUEUE_LIFO (2U << 16)
the choice of flags looks odd.
Why start at bit 16 and why waste two bits?
It's either stack or queue.
May be instead define two map_types:
BPF_MAP_TYPE_QUEUE and BPF_MAP_TYPE_STACK
that share common implementation?
And how about adding three new helpers: push/pop/peek as well?
Reusing lookup/update is neat, but does lookup == pop
or does lookup == peek ?
I suspect it will be confusing.
Three new helpers cost nothing, but will make bpf progs easier to read.
Could you also add a flag for replacement policy?
In this implementation when max_entries limit is reached
the map_update_elem (aka push) will fail with e2big.
It would be useful to allow pushing and dropping elements from
the other side then such queue/stack can be used to keep
track of the last N events (the prog would unconditionally push
and user space would swap the queue for new one via map-in-map
and drain old queue).
Speaking of map-in-map, please add a test to make sure
queue/stack works with hash/array_of_maps.
selftests in patch2 needs to have kernel side as well.
it's not enough to test syscall access only.
Powered by blists - more mailing lists