[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <415993ef-0238-4fc0-a2e5-acb938ec2b10@gmail.com>
Date: Fri, 13 Jun 2025 17:12:50 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Andrii Nakryiko <andrii@...nel.org>
Cc: io-uring@...r.kernel.org, Martin KaFai Lau <martin.lau@...ux.dev>,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC v2 5/5] io_uring/bpf: add basic kfunc helpers
On 6/13/25 01:25, Alexei Starovoitov wrote:
> On Thu, Jun 12, 2025 at 6:25 AM Pavel Begunkov <asml.silence@...il.com> wrote:
...>>>> +BTF_ID_FLAGS(func, bpf_io_uring_extract_next_cqe, KF_RET_NULL);
>>>> +BTF_KFUNCS_END(io_uring_kfunc_set)
>>>
>>> This is not safe in general.
>>> The verifier doesn't enforce argument safety here.
>>> As a minimum you need to add KF_TRUSTED_ARGS flag to all kfunc.
>>> And once you do that you'll see that the verifier
>>> doesn't recognize the cqe returned from bpf_io_uring_get_cqe*()
>>> as trusted.
>>
>> Thanks, will add it. If I read it right, without the flag the
>> program can, for example, create a struct io_ring_ctx on stack,
>> fill it with nonsense and pass to kfuncs. Is that right?
>
> No. The verifier will only allow a pointer to struct io_ring_ctx
> to be passed, but it may not be fully trusted.
>
> The verifier has 3 types of pointers to kernel structures:
> 1. ptr_to_btf_id
> 2. ptr_to_btf_id | trusted
> 3. ptr_to_btf_id | untrusted
>
> 1st was added long ago for tracing and gradually got adopted
> for non-tracing needs, but it has a foot gun, since
> all pointer walks keep ptr_to_btf_id type.
> It's fine in some cases to follow pointers, but not in all.
> Hence 2nd variant was added and there
> foo->bar dereference needs to be explicitly allowed
> instead of allowed by default like for 1st kind.
>
> All loads through 1 and 3 are implemented as probe_read_kernel.
> while loads from 2 are direct loads.
>
> So kfuncs without KF_TRUSTED_ARGS with struct io_ring_ctx *ctx
> argument are likely fine and safe, since it's impossible
> to get this io_ring_ctx pointer by dereferencing some other pointer.
> But better to tighten safety from the start.
> We recommend KF_TRUSTED_ARGS for all kfuncs and
> eventually it will be the default.
Sure, I'll add it, thanks for the explanation
...>> diff --git a/io_uring/bpf.c b/io_uring/bpf.c
>> index 9494e4289605..400a06a74b5d 100644
>> --- a/io_uring/bpf.c
>> +++ b/io_uring/bpf.c
>> @@ -2,6 +2,7 @@
>> #include <linux/bpf_verifier.h>
>>
>> #include "io_uring.h"
>> +#include "memmap.h"
>> #include "bpf.h"
>> #include "register.h"
>>
>> @@ -72,6 +73,14 @@ struct io_uring_cqe *bpf_io_uring_extract_next_cqe(struct io_ring_ctx *ctx)
>> return cqe;
>> }
>>
>> +__bpf_kfunc
>> +void *bpf_io_uring_get_region(struct io_ring_ctx *ctx, u64 size__retsz)
>> +{
>> + if (size__retsz > ((u64)ctx->ring_region.nr_pages << PAGE_SHIFT))
>> + return NULL;
>> + return io_region_get_ptr(&ctx->ring_region);
>> +}
>
> and bpf prog should be able to read/write anything in
> [ctx->ring_region->ptr, ..ptr + size] region ?
Right, and it's already rw mmap'ed into the user space.
> Populating (creating) dynptr is probably better.
> See bpf_dynptr_from*()
>
> but what is the lifetime of that memory ?
It's valid within a single run of the callback but shouldn't cross
into another invocation. Specifically, it's protected by the lock,
but that can be tuned. Does that match with what PTR_TO_MEM expects?
I can add refcounting for longer term pinning, maybe to store it
as a bpf map or whatever is the right way, but I'd rather avoid
anything expensive in the kfunc as that'll likely be called on
every program run.
--
Pavel Begunkov
Powered by blists - more mailing lists