lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220217035041.axk46atz7j4svi2k@ast-mbp.dhcp.thefacebook.com>
Date:   Wed, 16 Feb 2022 19:50:41 -0800
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Hou Tao <houtao1@...wei.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Martin KaFai Lau <kafai@...com>, Yonghong Song <yhs@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Song Liu <songliubraving@...com>,
        John Fastabend <john.fastabend@...il.com>,
        netdev@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab

On Mon, Feb 14, 2022 at 07:13:34PM +0800, Hou Tao wrote:
> Hi,
> 
> In order to use string as hash-table key, key_size must be the storage
> size of longest string. If there are large differencies in string
> length, the hash distribution will be sub-optimal due to the unused
> zero bytes in shorter strings and the lookup will be inefficient due to
> unnecessary memcmp().
> 
> Also it is possible the unused part of string key returned from bpf helper
> (e.g. bpf_d_path) is not mem-zeroed and if using it directly as lookup key,
> the lookup will fail with -ENOENT (as reported in [1]).
> 
> The patchset tries to address the inefficiency by adding support for
> string key. There is extensibility problem in v1 because the string key
> and its optimization is only available for string-only key. To make it
> extensible, v2 introduces bpf_str_key_stor and bpf_str_key_desc and enforce
> the layout of hash key struct through BTF as follows:
> 
> 	>the start of hash key
> 	...
> 	[struct bpf_str_key_desc m;]
> 	...
> 	[struct bpf_str_key_desc n;]
> 	...
> 	struct bpf_str_key_stor z;
> 	unsigned char raw[N];
> 	>the end of hash key

Sorry, but this is dead end.
The v2 didn't fundamentally change the design.
The bpf_str_key_desc has an offset field, but it's unused.
The len field is dynamically checked at run-time and all hash maps
users will be paying that penalty though majority will not be
using this feature.
This patch set is incredibly specific solution to one task.
It's far from being generic. We cannot extend bpf this way.
All building blocks must be as generic as possible.
If you want strings as a key then the key must be variable size.
This patch doesn't make them so. It still requires some
predefined fixed size for largest string. This is no go.
Variable sized key means truly variable to megabytes long.
The key would need to contain an object (a pointer wrap) to
a variable sized object. And this object can be arbitrary
blob of bytes. Not just null terminated string.
We've been thinking about "dynamic pointer" concept where
pointer + length will be represented as an object.
The program will be able to allocate it and persist into a map value
and potentially into a map key.
For storing a bunch of strings one can use a strong hash and store
that hash in the key while keeping full string as a variable sized
object inside the value.
Another approach is to introduce a trie to store strings, or dfa,
or aho-corasick, etc. There are plenty of data structures that are
more suitable for storing and searching strings depending on the use case.
Using hash table for strings has its downsides.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ