Message-ID: <d71f2351-af17-7e20-7f99-d628b7ab5765@bytedance.com>
Date:   Thu, 26 May 2022 10:40:35 +0800
From:   Feng Zhou <zhoufeng.zf@...edance.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
        Xiongchun Duan <duanxiongchun@...edance.com>,
        Muchun Song <songmuchun@...edance.com>,
        Dongdong Wang <wangdongdong.6@...edance.com>,
        Cong Wang <cong.wang@...edance.com>,
        Chengming Zhou <zhouchengming@...edance.com>
Subject: Re: [External] Re: [PATCH v2 2/2] selftest/bpf/benchs: Add bpf_map
 benchmark

On 2022/5/25 8:13 AM, Alexei Starovoitov wrote:
> On Tue, May 24, 2022 at 12:53 AM Feng zhou <zhoufeng.zf@...edance.com> wrote:
>> +static void setup(void)
>> +{
>> +       struct bpf_link *link;
>> +       int map_fd, i, max_entries;
>> +
>> +       setup_libbpf();
>> +
>> +       ctx.skel = bpf_map_bench__open_and_load();
>> +       if (!ctx.skel) {
>> +               fprintf(stderr, "failed to open skeleton\n");
>> +               exit(1);
>> +       }
>> +
>> +       link = bpf_program__attach(ctx.skel->progs.benchmark);
>> +       if (!link) {
>> +               fprintf(stderr, "failed to attach program!\n");
>> +               exit(1);
>> +       }
>> +
>> +       //fill hash_map
>> +       map_fd = bpf_map__fd(ctx.skel->maps.hash_map_bench);
>> +       max_entries = bpf_map__max_entries(ctx.skel->maps.hash_map_bench);
>> +       for (i = 0; i < max_entries; i++)
>> +               bpf_map_update_elem(map_fd, &i, &i, BPF_ANY);
>> +}
> ...
>> +SEC("fentry/" SYS_PREFIX "sys_getpgid")
>> +int benchmark(void *ctx)
>> +{
>> +       u32 key = bpf_get_prandom_u32();
>> +       u64 init_val = 1;
>> +
>> +       bpf_map_update_elem(&hash_map_bench, &key, &init_val, BPF_ANY);
>> +       return 0;
>> +}
> This benchmark is artificial in the extreme.
> First it populates the map up to max_entries and then
> constantly bounces off the max_entries limit in the bpf prog.
> Sometimes the random u32 will be less than max_entries
> and map_update_elem will hit the fast path,
> but most of the time alloc_htab_elem() will fail
> and map_update_elem will fail along with it.
>
> It does demonstrate that the percpu freelist is inefficient
> when it's empty, but such a microbenchmark alone does not
> justify optimizing this corner case.
>
> If there is a production use case, please code it up in
> a benchmark.

This corner case is not easy to reproduce. In a scenario with a surge of
network traffic, the map becomes full while a large number of update
operations keep arriving. As Yonghong Song said:
'''
in your use case, you have lots of different keys and your intention is
NOT to capture all the keys in the hash table. So given a hash table,
it is possible that the hash will become full even if you increase the
hashtable size. Maybe you will occasionally delete some keys, which will
free some space, but that space will be quickly occupied by new updates.
'''
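
To make this concrete, below is a minimal user-space sketch (my own
illustration, not part of the patch: the map is created ad hoc with
libbpf's bpf_map_create, and the name and sizes are arbitrary) showing
that once a hash map holds max_entries elements, updates with new keys
keep failing, typically with E2BIG:
'''
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <bpf/bpf.h>

int main(void)
{
	const __u32 max_entries = 1024;	/* arbitrary size for illustration */
	__u32 key;
	__u64 val = 1;
	int map_fd, err;

	map_fd = bpf_map_create(BPF_MAP_TYPE_HASH, "full_htab",
				sizeof(key), sizeof(val), max_entries, NULL);
	if (map_fd < 0) {
		fprintf(stderr, "map create failed: %s\n", strerror(errno));
		return 1;
	}

	/* Fill the map completely, like setup() does in the benchmark. */
	for (key = 0; key < max_entries; key++)
		bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);

	/* A new key no longer fits: the update fails (typically E2BIG),
	 * yet the kernel still scans the empty per-cpu freelists first. */
	key = max_entries;
	err = bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
	if (err)
		printf("update on full map failed: %s\n", strerror(errno));

	return 0;
}
'''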

>
> Also there is a lot of other overhead: syscalls and atomics.
> To stress map_update_elem, please use a for() loop inside the bpf prog.

OK, I will change how the test case stresses map_update_elem; a rough
sketch of the loop-based variant I have in mind is at the end of this
mail. This benchmark was added mainly to reproduce the case. As for
whether this case is worth optimizing, I use ftrace:
'''
cd /sys/kernel/debug/tracing/
echo > trace
echo htab_map_update_elem > set_graph_function
echo function_graph > current_tracer
cat per_cpu/cpu0/trace
echo nop > current_tracer
'''
These commands show whether the update operation keeps grabbing the
spin-lock of each cpu when the map is full. Then I turn ftrace off,
measure how long an update takes, and check whether there is any
improvement compared to the kernel before this patch.
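
For reference, here is a rough sketch of the loop-based program I have
in mind (the map definition is my reconstruction of hash_map_bench, the
max_entries value and the loop count are only illustrative, and
SYS_PREFIX comes from the selftests headers just as in the original
program; the final version may differ):
'''
// bpf_map_bench.bpf.c (sketch)
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"	/* assumed source of SYS_PREFIX */

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1000);	/* illustrative size */
	__type(key, u32);
	__type(value, u64);
} hash_map_bench SEC(".maps");

SEC("fentry/" SYS_PREFIX "sys_getpgid")
int benchmark(void *ctx)
{
	u64 init_val = 1;
	u32 key;
	int i;

	/* Do many updates per program run so the syscall and attach
	 * overhead is amortized and map_update_elem dominates. */
	for (i = 0; i < 1024; i++) {
		key = bpf_get_prandom_u32();
		bpf_map_update_elem(&hash_map_bench, &key, &init_val, BPF_ANY);
	}
	return 0;
}

char _license[] SEC("license") = "GPL";
'''
With the map pre-filled as in setup(), nearly every iteration still hits
the full-map path, but the per-update cost can now be measured without
the syscall noise.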


