Message-ID: <20220331122822.14283-1-houtao1@huawei.com>
Date:   Thu, 31 Mar 2022 20:28:20 +0800
From:   Hou Tao <houtao1@...wei.com>
To:     Alexei Starovoitov <ast@...nel.org>, Yonghong Song <yhs@...com>
CC:     Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <kafai@...com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Song Liu <songliubraving@...com>,
        KP Singh <kpsingh@...nel.org>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        <netdev@...r.kernel.org>, <bpf@...r.kernel.org>,
        <houtao1@...wei.com>
Subject: [RFC PATCH bpf-next 0/2] bpf: Introduce ternary search tree for string key

Hi,

The initial motivation for this patchset is a suggestion from Alexei.
During the discussion of supporting string keys in hash tables, he saw
the space efficiency of the ternary search tree in our early tests and
suggested that we post it as a new bpf map [1].

A ternary search tree is a special trie whose nodes are arranged in a
manner similar to a binary search tree, but with up to three children
rather than two. The three children correspond to nodes whose values are
less than, equal to, and greater than the value of the current node,
respectively.
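
To make the three-way branching concrete, here is a minimal user-space
sketch of a plain ternary search tree with one character per node. It is
not the code from this patchset: the names tst_node, tst_insert and
tst_lookup are made up for illustration, and the in-kernel map stores
runs of characters per node rather than single characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative node layout (an assumption, not the layout used in the patch). */
struct tst_node {
	unsigned char c;        /* character stored at this node */
	bool is_key;            /* true if some key ends at this node */
	struct tst_node *lo;    /* keys whose current character is < c */
	struct tst_node *eq;    /* keys that match c and continue */
	struct tst_node *hi;    /* keys whose current character is > c */
};

/*
 * Insert a non-empty NUL-terminated key and return the subtree root.
 * If an allocation fails, the key is silently not inserted.
 */
static struct tst_node *tst_insert(struct tst_node *n, const char *key)
{
	unsigned char c;

	assert(key && *key);
	c = (unsigned char)*key;
	if (!n) {
		n = calloc(1, sizeof(*n));
		if (!n)
			return NULL;
		n->c = c;
	}
	if (c < n->c)
		n->lo = tst_insert(n->lo, key);
	else if (c > n->c)
		n->hi = tst_insert(n->hi, key);
	else if (key[1])
		n->eq = tst_insert(n->eq, key + 1);
	else
		n->is_key = true;
	return n;
}

/* Return true if key was inserted before. */
static bool tst_lookup(const struct tst_node *n, const char *key)
{
	if (!key || !*key)
		return false;
	while (n) {
		unsigned char c = (unsigned char)*key;

		if (c < n->c) {
			n = n->lo;
		} else if (c > n->c) {
			n = n->hi;
		} else if (key[1]) {
			n = n->eq;      /* matched: move to the next character */
			key++;
		} else {
			return n->is_key;
		}
	}
	return false;
}
```

Only the "equal" branch consumes a character of the key; the "less
than" and "greater than" branches compare the same character against a
different node, which is what makes the structure behave like a binary
search tree at each character position.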

In the ternary search tree map, only the valid content of a string is
saved. The trailing null byte and any unused bytes after it are not
saved. If the strings share common prefixes, each prefix is saved only
once. Compared with other space-optimized tries (e.g. HAT-trie, succinct
trie), the advantages of the ternary search tree are that it is simple
and writable.
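
As a rough illustration of "the prefix is only saved once", the sketch
below counts how many key characters a prefix-sharing trie has to store
for a set of keys. It assumes the keys are passed in ascending strcmp()
order, so each key only needs the bytes beyond the prefix it shares with
its predecessor. The helper names are hypothetical, and per-node
overhead (pointers, lengths) is deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the longest common prefix of two NUL-terminated strings. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/*
 * Number of key characters stored when shared prefixes are kept once.
 * keys[] must be sorted in ascending strcmp() order; trailing NUL bytes
 * are not counted because they are not stored.
 */
static size_t shared_payload_bytes(const char *const *keys, size_t nr)
{
	size_t total = 0;

	for (size_t i = 0; i < nr; i++) {
		size_t len = strlen(keys[i]);
		size_t lcp = i ? common_prefix_len(keys[i - 1], keys[i]) : 0;

		assert(lcp <= len);
		total += len - lcp;
	}
	return total;
}
```

For the sorted keys "he", "hello", "tea" and "test" this gives 10
characters, which matches the segments "he", "llo", "te", "st" and "a"
that result from inserting the four example strings. Storing the same
keys in fixed-size slots would cost the slot size times four, regardless
of how short the strings are.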

Below are diagrams of the ternary search tree map as "hello", "he",
"test" and "tea" are inserted into it:

1. insert "hello"

        [ hello ]

2. insert "he": "hello" needs to be split into "he" and "llo"

         [ he ]
            |
            *
            |
         [ llo ]

3. insert "test": add it as right child of "he"

         [ he ]
            |
            *-------x
            |       |
         [ llo ] [ test ]

4. insert "tea": split "test" into "te" and "st",
   and insert "a" as left child of "st"

         [ he ]
            |
            *-------x
            |       |
         [ llo ] [ te ]
                    |
                    *
                    |
                 [ st ]
                    |
               x----*
               |
             [ a ]
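
The split step in the diagrams can be sketched in user-space C as
follows. This is only an illustration under assumed names (seg_node,
seg_new, seg_split); the node layout in the actual patch may differ.
The assumption here is that the left and right children are selected by
comparing against the first character of a node's segment, while the
middle child continues after its last character, so on a split the
left and right children stay with the head and the middle child moves
to the tail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compressed node: holds a run of characters, not just one. */
struct seg_node {
	char *seg;              /* stored characters, not NUL-terminated */
	size_t len;             /* number of bytes in seg */
	int is_key;             /* non-zero if a key ends at the end of seg */
	struct seg_node *lo;    /* chosen by comparing with seg[0] */
	struct seg_node *eq;    /* continues after seg[len - 1] */
	struct seg_node *hi;    /* chosen by comparing with seg[0] */
};

/* Allocate a node holding a copy of the first len bytes of s. */
static struct seg_node *seg_new(const char *s, size_t len)
{
	struct seg_node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->seg = malloc(len);
	if (!n->seg) {
		free(n);
		return NULL;
	}
	memcpy(n->seg, s, len);
	n->len = len;
	return n;
}

/*
 * Keep the first `at` bytes in n and move the rest into a new middle
 * child. Returns 0 on success, -1 if allocation fails (n is unchanged).
 * For simplicity the head's buffer is not shrunk.
 */
static int seg_split(struct seg_node *n, size_t at)
{
	struct seg_node *tail;

	assert(at > 0 && at < n->len);
	tail = seg_new(n->seg + at, n->len - at);
	if (!tail)
		return -1;
	/* The tail inherits whatever used to follow the end of the segment. */
	tail->is_key = n->is_key;
	tail->eq = n->eq;
	n->is_key = 0;
	n->eq = tail;
	n->len = at;
	return 0;
}
```

Inserting "he" into a tree that holds "hello" then amounts to calling
seg_split() on the "hello" node at offset 2 and marking the head as a
key. Inserting "tea" splits "test" at offset 2 in the same way and hangs
a new "a" node off the left pointer of the "st" tail.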

As shown in the diagrams above, the common prefix of "test" and "tea"
is "te", and it is saved only once. The patchset also adds benchmarks
that compare the memory usage and lookup performance of the ternary
search tree and the hash table. When the common prefix is long (~192
bytes) and the suffix is about 64 bytes long, the ternary search tree
uses about 2~3 times less memory than the hash table. But the memory
saving comes at a price: tst lookup is about 2~3 times slower than hash
table lookup. See patch #2 for more benchmark details.

Comments and suggestions are always welcome.

Regards,
Tao

[1]: https://lore.kernel.org/bpf/CAADnVQJUJp3YBcpESwR3Q1U6GS1mBM=Vp-qYuQX7eZOaoLjdUA@mail.gmail.com/

Hou Tao (2):
  bpf: Introduce ternary search tree for string key
  selftests/bpf: add benchmark for ternary search tree map

 include/linux/bpf_types.h                     |   1 +
 include/uapi/linux/bpf.h                      |   1 +
 kernel/bpf/Makefile                           |   1 +
 kernel/bpf/bpf_tst.c                          | 411 +++++++++++++++++
 tools/include/uapi/linux/bpf.h                |   1 +
 tools/testing/selftests/bpf/Makefile          |   5 +-
 tools/testing/selftests/bpf/bench.c           |   6 +
 .../selftests/bpf/benchs/bench_tst_map.c      | 415 ++++++++++++++++++
 .../selftests/bpf/benchs/run_bench_tst.sh     |  54 +++
 tools/testing/selftests/bpf/progs/tst_bench.c |  70 +++
 10 files changed, 964 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/bpf_tst.c
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_tst_map.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_tst.sh
 create mode 100644 tools/testing/selftests/bpf/progs/tst_bench.c

-- 
2.31.1
