Message-ID:
<SY4P282MB23134B64BA6B0AC71A27BF37C6EF2@SY4P282MB2313.AUSP282.PROD.OUTLOOK.COM>
Date: Tue, 28 Jan 2025 19:22:37 +0800
From: Levi Zim <rsworktech@...look.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Andrei Matei <andreimatei1@...il.com>, Jordan Rome
<linux@...danrome.com>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman
<eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>,
John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, Matt Bobrowski <mattbobrowski@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu
<mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
linux-trace-kernel <linux-trace-kernel@...r.kernel.org>,
"open list:KERNEL SELFTEST FRAMEWORK" <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2 1/7] bpf: Implement
bpf_probe_read_kernel_dynptr helper
On 2025/1/28 10:57, Alexei Starovoitov wrote:
> On Mon, Jan 27, 2025 at 3:09 PM Andrii Nakryiko
> <andrii.nakryiko@...il.com> wrote:
>> On Mon, Jan 27, 2025 at 2:54 PM Andrei Matei <andreimatei1@...il.com> wrote:
>>> On Mon, Jan 27, 2025 at 5:04 PM Alexei Starovoitov
>>> <alexei.starovoitov@...il.com> wrote:
>>>> On Sat, Jan 25, 2025 at 5:05 PM Levi Zim <rsworktech@...look.com> wrote:
>>>>> On 2025/1/26 00:58, Alexei Starovoitov wrote:
>>>>> > On Sat, Jan 25, 2025 at 12:30 AM Levi Zim via B4 Relay
>>>>> > <devnull+rsworktech.outlook.com@...nel.org> wrote:
>>>>> >> From: Levi Zim <rsworktech@...look.com>
>>>>> >>
>>>>> >> This patch adds a helper function bpf_probe_read_kernel_dynptr:
>>>>> >>
>>>>> >> long bpf_probe_read_kernel_dynptr(const struct bpf_dynptr *dst,
>>>>> >> u32 offset, u32 size, const void *unsafe_ptr, u64 flags);
>>>>> > We stopped adding helpers years ago.
>>>>> > Only new kfuncs are allowed.
>>>>>
>>>>> Sorry, I didn't know that. Just asking, is there any
>>>>> documentation/discussion about stopping adding helpers?
>>>>>
>>>>> I will switch the implementation to kfuncs in v3.
>>>>>
>>>>> > This particular one doesn't look useful as-is.
>>>>> > The same logic can be expressed with
>>>>> > - create dynptr
>>>>> > - dynptr_slice
>>>>> > - copy_from_kernel
>>>>>
>>>>> By copy_from_kernel I assume you mean bpf_probe_read_kernel. The problem
>>>>> with dynptr_slice_rdwr and probe_read_kernel is that they only support a
>>>>> compile-time constant size [1].
>>>>>
>>>>> But in order to best utilize the space on a BPF ringbuf, it is possible
>>>>> to reserve a variable length of space as dynptr on a ringbuf with
>>>>> bpf_ringbuf_reserve_dynptr.
>>> For our uprobes, we've run into similar issues around doing variable-sized
>>> bpf_probe_read_user() into ring buffers for our debugger [1]. Our use case
>>> is that we generate uprobes that recursively read data structures until we
>>> fill up a buffer. The verifier's insistence on knowing statically that a read
>>> fits into the buffer makes for awkward code, and makes it hard to pack the
>>> buffer fully; we have to split our reads into a couple of static size classes.
>>>
>>> Any chance there'd be interest in taking the opportunity to support
>>> dynamically-sized reads from userspace too? :)
>> That's bpf_probe_read_user_dynptr() from patch #2, no?
>>
>> But generally speaking, here's a list of new APIs that we'd need to
>> cover all existing fixed buffer versions:
>>
>> - non-sleepable probe reads:
>>
>> bpf_probe_read_kernel_dynptr()
>> bpf_probe_read_user_dynptr()
>> bpf_probe_read_kernel_str_dynptr()
>> bpf_probe_read_user_str_dynptr()
>>
>> - sleepable probe reads (copy_from_user):
>>
>> bpf_copy_from_user_dynptr()
>> bpf_copy_from_user_str_dynptr()
>>
>> - and then we have complementary task-based APIs for non-current process:
>>
>> bpf_probe_read_user_task_dynptr()
>> bpf_probe_read_user_str_task_dynptr()
>> bpf_copy_from_user_task_dynptr()
>> bpf_copy_from_user_str_task_dynptr()
>>
>> Jordan is working on non-dynptr version of
>> bpf_copy_from_user_str_task(), once he's done with that, we'll add
>> dynptr version, probably.
> This is quite a bunch of kfuncs.
> It doesn't look like adding _dynptr suffix and duplicating
> kfuncs approach scales.
The _str_dynptr versions might not be worth adding [1], so only four
kfuncs are needed: bpf_probe_read_{kernel,user}_dynptr and
bpf_copy_from_user{,_task}_dynptr. That seems manageable for now.
But taking other helpers like bpf_strtol into account quickly shows
that this approach does not scale.
> Let's make the existing helpers/kfuncs more flexible ?
>
> We can introduce a kfunc bpf_dynptr_buf() that checks that
> dynptr is not readonly and type == local or ringbuf and
> return dynptr->data as PTR_TO_MEM | dynptr_flag | VERIFIER_ADDS_SIZE_CHECK.
>
> Then allow bpf_probe_read_user/kernel/... all of them to accept
> this register type where PTR_TO_MEM is required
> while relaxing ARG_CONST_SIZE 2nd argument to ARG_ANYTHING.
> Then the verifier will insert an extra check
> if (arg1->size < arg2)
> before the call.
Nice idea. I will try this approach first.
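To make sure I understand the runtime check described above, here is a
minimal userspace model of it. This is only a sketch, not kernel or
verifier code: struct model_dynptr, model_dynptr_buf(), model_probe_read()
and the returned error codes are hypothetical stand-ins I made up for
illustration, and the real register types and verifier rewrite would
look different.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the memory region a dynptr describes. */
struct model_dynptr {
	void *data;     /* start of the backing buffer */
	uint32_t size;  /* number of bytes available at data */
	int rdonly;     /* non-zero if the buffer must not be written */
};

/*
 * Models the proposed bpf_dynptr_buf(): reject read-only dynptrs,
 * otherwise hand back the data pointer. The real kfunc would also
 * check the dynptr type (local or ringbuf); that is omitted here.
 */
static void *model_dynptr_buf(const struct model_dynptr *p)
{
	return p->rdonly ? NULL : p->data;
}

/*
 * Models a probe-read style call whose size argument is not a
 * compile-time constant. The "if (arg1->size < arg2)" check that the
 * verifier would insert before the call is written out explicitly.
 * The error codes are assumptions for this sketch.
 */
static long model_probe_read(const struct model_dynptr *dst,
			     uint32_t size, const void *src)
{
	void *buf = model_dynptr_buf(dst);

	if (!buf)
		return -EINVAL;
	if (dst->size < size)   /* the inserted size check */
		return -E2BIG;
	memcpy(buf, src, size);
	return 0;
}
```

With an 8-byte buffer, a 4-byte read succeeds and a 9-byte read is
rejected by the size check before any copy happens, which is the
behaviour I would expect from the verifier-inserted check.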
>
> Not only the bpf_probe_read_kernel/user, _str variants will work
> but things like bpf_strtol, bpf_strncmp, bpf_snprintf, bpf_get_stack
> will auto-magically work as well.
>
> I think those are quite valuable to make available with non-constant size.
> bpf_get_stack_*() directly into the ring buffer sounds very useful.
[1]:
https://lore.kernel.org/bpf/20250125-bpf_dynptr_probe-v2-0-c42c87f97afe@outlook.com/T/#m9700146d286a88abc0b25ef47041015ba6c477a3