lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 5 Feb 2024 15:24:43 -0800
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Marco Elver <elver@...gle.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
 Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>,
 John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
 Stanislav Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>,
 Jiri Olsa <jolsa@...nel.org>, Mykola Lysenko <mykolal@...com>,
 Shuah Khan <shuah@...nel.org>, bpf@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH] bpf: Separate bpf_local_storage_lookup() fast and slow
 paths

On 2/5/24 7:00 AM, Marco Elver wrote:
> On Wed, 31 Jan 2024 at 20:52, Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> [...]
>>> | num_maps: 1000
>>> |  local_storage cache sequential  get:
>>> |                              <before>                | <after>
>>> |   hits throughput:           0.357 ± 0.005 M ops/s   | 0.325 ± 0.005 M ops/s        (-9.0%)
>>> |   hits latency:              2803.738 ns/op          | 3076.923 ns/op               (+9.7%)
>>
>> Is it understood why the slow down here? The same goes for the "num_maps: 32"
>> case above but not as bad as here.
> 
> It turned out that there's a real slowdown due to the outlined
> slowpath. If I inline everything except for inserting the entry into
> the cache (cacheit_lockit codepath is still outlined), the results
> look much better even for the case where it always misses the cache.
> 
> [...]
>>> diff --git a/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c b/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c
>>> index a043d8fefdac..9895087a9235 100644
>>> --- a/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c
>>> +++ b/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c
>>> @@ -21,7 +21,7 @@ struct {
>>>        __type(value, long);
>>>    } map_b SEC(".maps");
>>>
>>> -SEC("fentry/bpf_local_storage_lookup")
>>> +SEC("fentry/bpf_local_storage_lookup_slowpath")
>>
>> The selftest is trying to catch recursion. The change here cannot test the same
>> thing because the slowpath will never be hit in the test_progs.  I don't have a
>> better idea for now also.
> 
> Trying to prepare a v2, and for the test, the only option I see is to
> introduce a tracepoint ("bpf_local_storage_lookup"). If unused, should
> be a no-op due to static branch.
> 
> Or can you suggest different functions to hook to for the recursion test?

I don't prefer to add another tracepoint for the selftest.

The test in "SEC("fentry/bpf_local_storage_lookup")" is testing that the initial 
bpf_local_storage_lookup() should work and the immediate recurred 
bpf_task_storage_delete() will fail.

Depends on how the new slow path function will look like in v2. The test can 
probably be made to go through the slow path, e.g. by creating a lot of task 
storage maps before triggering the lookup.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ