Message-ID: <c8f473f7-57ec-3161-e634-fc2e6925ec3d@huawei.com>
Date:   Mon, 8 Nov 2021 22:05:33 +0800
From:   Hou Tao <houtao1@...wei.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:     Alexei Starovoitov <ast@...nel.org>,
        Martin KaFai Lau <kafai@...com>, Yonghong Song <yhs@...com>,
        Song Liu <songliubraving@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>, <netdev@...r.kernel.org>,
        <bpf@...r.kernel.org>
Subject: Re: [RFC PATCH bpf-next 2/2] selftests/bpf: add benchmark bpf_strcmp

Hi,

On 11/7/2021 2:43 AM, Alexei Starovoitov wrote:
> On Sat, Nov 06, 2021 at 09:28:22PM +0800, Hou Tao wrote:
>> The benchmark runs a loop 5000 times. In the loop it reads the file name
>> from kprobe argument into stack by using bpf_probe_read_kernel_str(),
>> and compares the file name with a target character or string.
>>
>> Three cases are compared: only compare one character, compare the whole
>> string by a home-made strncmp() and compare the whole string by
>> bpf_strcmp().
>>
>> The following is the result:
>>
>> x86-64 host:
>>
>> one character: 2613499 ns
>> whole str by strncmp: 2920348 ns
>> whole str by helper: 2779332 ns
>>
>> arm64 host:
>>
>> one character: 3898867 ns
>> whole str by strncmp: 4396787 ns
>> whole str by helper: 3968113 ns
>>
>> Compared with home-made strncmp, the performance of bpf_strncmp helper
>> improves 80% under x86-64 and 600% under arm64. The big performance win
>> on arm64 may comes from its arch-optimized strncmp().
> 80% and 600% improvement?!
> I don't understand how this math works.
> Why one char is barely different in total nsec than the whole string?
> The string shouldn't miscompare on the first char as far as I understand the test.
Because the result of "one character" includes the overhead of process filtering
and of reading the string.
My bad, I should have explained the test results in more detail.

Three tests are exercised:

(1) one character
Filter out unexpected callers by bpf_get_current_pid_tgid()
Use bpf_probe_read_kernel_str() to read the file name into a 64-byte buffer
on the stack
Only compare the first character of the file name

(2) whole str by strncmp
Filter out unexpected callers by bpf_get_current_pid_tgid()
Use bpf_probe_read_kernel_str() to read the file name into a 64-byte buffer
on the stack
Compare by using the home-made strncmp(): the two compared strings are the
same, so the whole string is compared

(3) whole str by helper
Filter out unexpected callers by bpf_get_current_pid_tgid()
Use bpf_probe_read_kernel_str() to read the file name into a 64-byte buffer
on the stack
Compare by using bpf_strncmp(): the two compared strings are the same, so
the whole string is compared
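
For readers who do not have the patch at hand, a home-made strncmp() of the
kind used in test (2) could look roughly like the sketch below. This is plain
userspace C written for illustration only; the name local_strncmp() and its
argument order are assumptions and it is not the code from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical byte-by-byte comparison in the spirit of a hand-written
 * strncmp(). Illustration only, not the code from the patch.
 */
static int local_strncmp(const char *s1, const char *s2, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Compare as unsigned char so that e.g. 0xff > 0x31. */
		int diff = (unsigned char)s1[i] - (unsigned char)s2[i];

		/* Stop at the first mismatch or at the end of the string. */
		if (diff || !s1[i])
			return diff;
	}

	return 0;
}
```

When both strings are equal, as in tests (2) and (3), the loop walks every
byte up to the terminating NUL, which is the cost the benchmark is meant to
expose.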

Now "(1) one character" is used as the baseline for the overhead of process
filtering and string read. So under x86-64, the overhead of strncmp() is:

  total time of whole str by strncmp test - total time of one character test =
306849 ns

The overhead of bpf_strncmp() is:

  total time of whole str by helper test - total time of one character test =
165833 ns

So the performance win is about (306849 / 165833) * 100 - 100 = ~85%

And the win under arm64 is about (497920 / 69246) * 100 - 100 = ~600%
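
The same arithmetic can be written down as two small C helpers, using test (1)
as the baseline. The function names cmp_overhead() and win_percent() are made
up for this illustration and do not appear in the patch.

```c
/* Time spent in the comparison itself: total minus the baseline test, in ns. */
static long cmp_overhead(long total_ns, long baseline_ns)
{
	return total_ns - baseline_ns;
}

/* Relative win of the helper over the home-made strncmp(), in percent. */
static double win_percent(long strncmp_ns, long helper_ns)
{
	return (double)strncmp_ns / (double)helper_ns * 100.0 - 100.0;
}
```

With the x86-64 totals, cmp_overhead(2920348, 2613499) is 306849 and
cmp_overhead(2779332, 2613499) is 165833, so win_percent(306849, 165833) is
about 85. With the arm64 totals, the overheads are 497920 and 69246, and
win_percent(497920, 69246) is about 619, which is the figure rounded to ~600%
above.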
