[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZi6A1vcFUFdwSQrbag_ptoU9+imdWhHdVKCLB8yvPSTw@mail.gmail.com>
Date: Wed, 22 May 2019 15:13:53 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Stanislav Fomichev <sdf@...ichev.me>
Cc: Andrii Nakryiko <andriin@...com>, Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Networking <netdev@...r.kernel.org>, bpf@...r.kernel.org,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 05/12] libbpf: add resizable non-thread safe
internal hashmap
On Wed, May 22, 2019 at 1:30 PM Stanislav Fomichev <sdf@...ichev.me> wrote:
>
> On 05/22, Andrii Nakryiko wrote:
> > There is a need for fast point lookups inside libbpf for multiple use
> > cases (e.g., name resolution for BTF-to-C conversion, by-name lookups in
> > BTF for upcoming BPF CO-RE relocation support, etc). This patch
> > implements simple resizable non-thread safe hashmap using single linked
> > list chains.
> Didn't really look into the details, but any reason you're not using
> linux/hashtable.h? It's exported in tools/include and I think perf
> is using it. It's probably not resizable, but should be easy to
> implement rebalancing on top of it.
There are multiple reasons.
1. linux/hashtable.h is pretty bare-bones, it's just hlist_node and a
bunch of macro to manipulate array or chains of them. I wanted to have
higher-level API with lookup by key, insertion w/ various strategies,
etc. Preferrably one not requiring to manipulate hlist_node directly
as part of its API, even if at some performance cost of hiding that
low-level detail.
2. Resizing is a big chunk of resizable hashmap logic, so I'd need to
write a bunch of additional code anyway.
3. Licensing. linux/hashtable.h is under GPL, while libbpf is
dual-licensed under GPL and BSD. When syncing libbpf from kernel to
github, we have to re-implement all the parts from kernel that are not
under BSD license anyway.
4. hlist_node keeps two pointers per item, which is unnecessary for
hashmap which does deletion by key (by searching for node first, then
deleting), so we can also have lower memory overhead per entry.
So in general, I feel like there is little benefit to reusing
linux/hashlist.h for use cases I'm targeting this hashmap for.
>
> > Four different insert strategies are supported:
> > - HASHMAP_ADD - only add key/value if key doesn't exist yet;
> > - HASHMAP_SET - add key/value pair if key doesn't exist yet; otherwise,
> > update value;
> > - HASHMAP_UPDATE - update value, if key already exists; otherwise, do
> > nothing and return -ENOENT;
> > - HASHMAP_APPEND - always add key/value pair, even if key already exists.
> > This turns hashmap into a multimap by allowing multiple values to be
> > associated with the same key. Most useful read API for such hashmap is
> > hashmap__for_each_key_entry() iteration. If hashmap__find() is still
> > used, it will return last inserted key/value entry (first in a bucket
> > chain).
> >
> > For HASHMAP_SET and HASHMAP_UPDATE, old key/value pair is returned, so
> > that calling code can handle proper memory management, if necessary.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@...com>
> > ---
> > tools/lib/bpf/Build | 2 +-
> > tools/lib/bpf/hashmap.c | 229 ++++++++++++++++++++++++++++++++++++++++
> > tools/lib/bpf/hashmap.h | 173 ++++++++++++++++++++++++++++++
> > 3 files changed, 403 insertions(+), 1 deletion(-)
> > create mode 100644 tools/lib/bpf/hashmap.c
> > create mode 100644 tools/lib/bpf/hashmap.h
> >
> >
Powered by blists - more mailing lists