netdev - Re: [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAADnVQ+z75P0sryoGhgUwrHRMr2Jw=eFO4eCRe0Ume554si9Zg@mail.gmail.com>
Date:   Sat, 19 Feb 2022 10:44:29 -0800
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Hou Tao <houtao1@...wei.com>
Cc:     Martin KaFai Lau <kafai@...com>, Yonghong Song <yhs@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Song Liu <songliubraving@...com>,
        John Fastabend <john.fastabend@...il.com>,
        Network Development <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Joanne Koong <joannekoong@...com>
Subject: Re: [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab

On Fri, Feb 18, 2022 at 5:54 AM Hou Tao <houtao1@...wei.com> wrote:
>
> > We've been thinking about "dynamic pointer" concept where
> > pointer + length will be represented as an object.
> I have seen the proposal from Joanne Koong on "dynamic pointers".
> It can solve the storage problem of string, but the lookup of
> string is still a problem. Hash is a option but can we support
> two dynamic pointers points to the same internal object and use
> the id of the internal object to represent the string ?

Let's have a discussion about dynptr in that thread.

> > The program will be able to allocate it and persist into a map value
> > and potentially into a map key.
> > For storing a bunch of strings one can use a strong hash and store
> > that hash in the key while keeping full string as a variable sized
> > object inside the value.
> Will using a strong hash function impact the lookup performance because
> each lookup will need to calculate the hash ?
>
> > Another approach is to introduce a trie to store strings, or dfa,
> > or aho-corasick, etc. There are plenty of data structures that are
> > more suitable for storing and searching strings depending on the use case.
> > Using hash table for strings has its downsides.
> > .
> Before add support for string key in hash table, we had implement tries,
> ternary search tree and hash table in user-space for string lookup. hash
> table shows better performance and memory usage, so we decide to do string
> support in hash table. We will revisit our tests and investigate new string
> data structures.

What performance characteristics did you consider?
Just the speed of the lookup ?
How many elements are there in the map ?
Is it 80% or 10% full ?
Did you compare memory usage ?
Did you consider the speed of update/delete ?
preallocated or dynamic alloc ?

With dynamic pointers and further verifier improvements bpf programs
will be able to implement a hash table on their own.