[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20181003211052.2cofqi7jfbmdleoy@ast-mbp.dhcp.thefacebook.com>
Date: Wed, 3 Oct 2018 14:10:54 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Mauricio Vasquez <mauricio.vasquez@...ito.it>
Cc: daniel@...earbox.net, netdev@...r.kernel.org
Subject: Re: [RFC PATCH bpf-next v3 4/7] bpf: add bpf queue and stack maps
On Wed, Oct 03, 2018 at 12:01:37PM -0500, Mauricio Vasquez wrote:
>
>
> On 10/01/2018 07:26 PM, Alexei Starovoitov wrote:
> > On Mon, Oct 01, 2018 at 08:11:43AM -0500, Mauricio Vasquez wrote:
> > > > > > +BPF_CALL_3(bpf_map_pop_elem, struct bpf_map *, map, void *,
> > > > > > value, u32, size)
> > > > > > +{
> > > > > > + void *ptr;
> > > > > > +
> > > > > > + if (map->value_size != size)
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + ptr = map->ops->map_lookup_and_delete_elem(map, NULL);
> > > > > > + if (!ptr)
> > > > > > + return -ENOENT;
> > > > > > +
> > > > > > + switch (size) {
> > > > > > + case 1:
> > > > > > + *(u8 *) value = *(u8 *) ptr;
> > > > > > + break;
> > > > > > + case 2:
> > > > > > + *(u16 *) value = *(u16 *) ptr;
> > > > > > + break;
> > > > > > + case 4:
> > > > > > + *(u32 *) value = *(u32 *) ptr;
> > > > > > + break;
> > > > > > + case 8:
> > > > > > + *(u64 *) value = *(u64 *) ptr;
> > > > > > + break;
> > > > > this is inefficient. can we pass value ptr into ops and let it
> > > > > populate it?
> > > > I don't think so, doing that implies that look_and_delete will be a
> > > > per-value op, while other ops in maps are per-reference.
> > > > For instance, how to change it in the case of peek helper that is using
> > > > the lookup operation?, we cannot change the signature of the lookup
> > > > operation.
> > > >
> > > > This is something that worries me a little bit, we are creating new
> > > > per-value helpers based on already existing per-reference operations,
> > > > this is not probably the cleanest way. Here we are at the beginning of
> > > > the discussion once again, how should we map helpers and syscalls to
> > > > ops.
> > > >
> > > > What about creating pop/peek/push ops, mapping helpers one to one and
> > > > adding some logic into syscall.c to call the correct operation in case
> > > > the map is stack/queue?
> > > > Syscall mapping would be:
> > > > bpf_map_lookup_elem() -> peek
> > > > bpf_map_lookup_and_delete_elem() -> pop
> > > > bpf_map_update_elem() -> push
> > > >
> > > > Does it make sense?
> > > Hello Alexei,
> > >
> > > Do you have any feedback on this specific part?
> > Indeed. It seems push/pop ops will be cleaner.
> > I still think that peek() is useless due to races.
> > So BPF_MAP_UPDATE_ELEM syscall cmd will map to 'push' ops
> > and new BPF_MAP_LOOKUP_AND_DELETE_ELEM will map to 'pop' ops.
> > right?
> >
> >
> That's right.
>
> While updating the push api some came to me, do we have any specific reason
> to support only 1, 2, 4, 8 bytes? I think we could do it general enough to
> support any number of bytes, if the user is worried about the cost of
> memcpys he could use a map of 8 bytes pointers as you mentioned some time
> ago.
> From an API point of view, pop/peek helpers already expect a void *value
> (int bpf_map_[pop, peek]_elem(map, void *value)), the only change would also
> to use a pointer in the push instead of a u64.
Indeed. Good idea. Full copy of the value for push and pop makes sense.
On the verifier side we probably need ARG_PTR_TO_UNINIT_MAP_VALUE
in addition to normal ARG_PTR_TO_MAP_VALUE for the case of bpf_map_pop().
Powered by blists - more mailing lists