[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211220030013.4jsnm367ckl5ksi5@ast-mbp.dhcp.thefacebook.com>
Date: Sun, 19 Dec 2021 19:00:13 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Hou Tao <houtao1@...wei.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Martin KaFai Lau <kafai@...com>, Yonghong Song <yhs@...com>,
Song Liu <songliubraving@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, netdev@...r.kernel.org,
bpf@...r.kernel.org, yunbo.xufeng@...ux.alibaba.com
Subject: Re: [RFC PATCH bpf-next 0/3] support string key for hash-table
On Sun, Dec 19, 2021 at 01:22:42PM +0800, Hou Tao wrote:
> Hi,
>
> In order to use string as hash-table key, key_size must be the storage
> size of longest string. If there are large differencies in string
> length, the hash distribution will be sub-optimal due to the unused
> zero bytes in shorter strings and the lookup will be inefficient due to
> unnecessary memcpy().
>
> Also it is possible the unused part of string key returned from bpf helper
> (e.g. bpf_d_path) is not mem-zeroed and if using it directly as lookup key,
> the lookup will fail with -ENOENT (as reported in [1]).
>
> The patchset tries to address the inefficiency by adding support for
> string key. During the key comparison, the string length is checked
> first to reduce the uunecessary memcmp. Also update the hash function
> from jhash() to full_name_hash() to reduce hash collision of string key.
>
> There are about 16% and 106% improvment in benchmark under x86-64 and
> arm64 when key_size is 256. About 45% and %161 when key size is greater
> than 1024.
>
> Also testing the performance improvment by using all files under linux
> kernel sources as the string key input. There are about 74k files and the
> maximum string length is 101. When key_size is 104, there are about 9%
> and 35% win under x86-64 and arm64 in lookup performance, and when key_size
> is 256, the win increases to 78% and 109% respectively.
>
> Beside the optimization of lookup for string key, it seems that the
> allocated space for BPF_F_NO_PREALLOC-case can also be optimized. More
> trials and tests will be conducted if the idea of string key is accepted.
It will work when the key is a string. Sooner or later somebody would need
the key to be a string and few other integers or pointers.
This approach will not be usable.
Much worse, this approach will be impossible to extend.
Have you considered a more generic string support?
Make null terminated string to be a fist class citizen.
wdyt?
Powered by blists - more mailing lists