[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190522223013.GC3032@mini-arch>
Date: Wed, 22 May 2019 15:30:13 -0700
From: Stanislav Fomichev <sdf@...ichev.me>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Andrii Nakryiko <andriin@...com>, Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Networking <netdev@...r.kernel.org>, bpf@...r.kernel.org,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 05/12] libbpf: add resizable non-thread safe
internal hashmap
On 05/22, Andrii Nakryiko wrote:
> On Wed, May 22, 2019 at 1:30 PM Stanislav Fomichev <sdf@...ichev.me> wrote:
> >
> > On 05/22, Andrii Nakryiko wrote:
> > > There is a need for fast point lookups inside libbpf for multiple use
> > > cases (e.g., name resolution for BTF-to-C conversion, by-name lookups in
> > > BTF for upcoming BPF CO-RE relocation support, etc). This patch
> > > implements simple resizable non-thread safe hashmap using single linked
> > > list chains.
> > Didn't really look into the details, but any reason you're not using
> > linux/hashtable.h? It's exported in tools/include and I think perf
> > is using it. It's probably not resizable, but should be easy to
> > implement rebalancing on top of it.
>
> There are multiple reasons.
> 1. linux/hashtable.h is pretty bare-bones, it's just hlist_node and a
> bunch of macro to manipulate array or chains of them. I wanted to have
> higher-level API with lookup by key, insertion w/ various strategies,
> etc. Preferrably one not requiring to manipulate hlist_node directly
> as part of its API, even if at some performance cost of hiding that
> low-level detail.
> 2. Resizing is a big chunk of resizable hashmap logic, so I'd need to
> write a bunch of additional code anyway.
> 3. Licensing. linux/hashtable.h is under GPL, while libbpf is
> dual-licensed under GPL and BSD. When syncing libbpf from kernel to
> github, we have to re-implement all the parts from kernel that are not
> under BSD license anyway.
> 4. hlist_node keeps two pointers per item, which is unnecessary for
> hashmap which does deletion by key (by searching for node first, then
> deleting), so we can also have lower memory overhead per entry.
>
> So in general, I feel like there is little benefit to reusing
> linux/hashlist.h for use cases I'm targeting this hashmap for.
Makes sense. Licensing is probably the biggest issue here because
my original suggestion was to use linux/hashtable.h internally,
just wrap it in a nice api.
But agree on all points, thanks for clarification!
> > > Four different insert strategies are supported:
> > > - HASHMAP_ADD - only add key/value if key doesn't exist yet;
> > > - HASHMAP_SET - add key/value pair if key doesn't exist yet; otherwise,
> > > update value;
> > > - HASHMAP_UPDATE - update value, if key already exists; otherwise, do
> > > nothing and return -ENOENT;
> > > - HASHMAP_APPEND - always add key/value pair, even if key already exists.
> > > This turns hashmap into a multimap by allowing multiple values to be
> > > associated with the same key. Most useful read API for such hashmap is
> > > hashmap__for_each_key_entry() iteration. If hashmap__find() is still
> > > used, it will return last inserted key/value entry (first in a bucket
> > > chain).
> > >
> > > For HASHMAP_SET and HASHMAP_UPDATE, old key/value pair is returned, so
> > > that calling code can handle proper memory management, if necessary.
> > >
> > > Signed-off-by: Andrii Nakryiko <andriin@...com>
> > > ---
> > > tools/lib/bpf/Build | 2 +-
> > > tools/lib/bpf/hashmap.c | 229 ++++++++++++++++++++++++++++++++++++++++
> > > tools/lib/bpf/hashmap.h | 173 ++++++++++++++++++++++++++++++
> > > 3 files changed, 403 insertions(+), 1 deletion(-)
> > > create mode 100644 tools/lib/bpf/hashmap.c
> > > create mode 100644 tools/lib/bpf/hashmap.h
> > >
> > >
Powered by blists - more mailing lists