[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAADnVQLJ=nxp3bZYYMJd0yrUtMNx2DcvYXXmbGKBQAiG85kSLQ@mail.gmail.com>
Date: Mon, 6 May 2024 17:26:38 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Benjamin Tissoires <bentiss@...nel.org>
Cc: Daniel Borkmann <daniel@...earbox.net>, Alexei Starovoitov <ast@...nel.org>,
John Fastabend <john.fastabend@...il.com>, Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] bpf: verifier: allow arrays of progs to be used in
sleepable context
On Tue, Apr 30, 2024 at 3:03 AM Benjamin Tissoires <bentiss@...nel.org> wrote:
>
>
> Right now, what I am doing is (in simplified pseudo code):
> - in a bpf program, the user calls hid_bpf_attach_prog(hid_device, program_fd)
> where program fd is a tracing program on a never executed function
> but this allows to know the type of program to run
> - the kernel stores that program into a dedicated prog array bpf_map
> pre-loaded at boot time
> - when a event comes in, the kernel walks through the list of attached
> programs, calls __hid_bpf_tail_call() and there is a tracing program
> attached to it that just do the bpf_tail_call.
>
> This works and is simple enough from the user point of view, but is
> rather inefficient and clunky from the kernel point of view IMO.
>
> The freplace mechnism would definitely work if I had a tracing-like
> function to call, where I need to run the program any time the function
> gets called. But given that I want per-device filtering, I'm not sure
> how I could make this work. But given that I need to enable or not the
> bpf_program, I'm not sure how I could make it work from the kernel point
> of view.
>
> I tried using a simple bpf_prog_run() (which is exactly what I need in
> the end) but I couldn't really convince the bpf verifier that the
> provided context is a struct hid_bpf_ctx kernel pointer, and it felt not
> quite right.
>
> So after seeing how the bpf_wq worked internally, and how simple it is
> now to call a bpf program from the kernel as a simple function call, I
> played around with allowing kfunc to declare async callback functions.
>
> I have a working prototype (probably not fully functional for all of the
> cases), but I would like to know if you think it would be interesting to
> have 3 new suffixes:
> - "__async" for declaring an static bpf program that can be stored in
> the kernel and which would be non sleepable
> - "__s_async" same as before, but for sleepable operations
> - "__aux" (or "__prog_aux") for that extra parameter to
> bpf_wq_set_callback_impl() which contains the struct bpf_prog*.
Sorry for the delay. I've been traveling.
I don't quite understand how these suffixes will work.
You mean arguments of kfuncs to tell kfunc that an argument
is a pointer to async callback?
Sort-of generalization of is_async_callback_calling_kfunc() ?
I cannot connect the dots.
Feels dangerous to open up bpf prog calling to arbitrary kfuncs.
wq/timer/others are calling progs with:
callback_fn((u64)(long)map, (u64)(long)key, (u64)(long)value, 0, 0);
I feel we'll struggle to code review kfuncs that do such things.
Plus all prog life time management concerns.
wq/timer do careful bpf_prog_put/get dance.
> (I still don't have the __aux yet FWIW)
>
> The way I'm doing it is looking at the btf information to fetch the
> signature of the parameters of the callback, this way we can declare any
> callback without having to teach the verifier of is arguments (5 max).
>
> Is this something you would be comfortable with or is there a simpler
> mechanism already in place to call the bpf programs from the kernel
> without the ctx limitations?
"ctx limitations" you mean 5 args?
Have you looked at struct_ops ?
It can have any number of args.
Maybe declare struct_ops in hid and let user space provide struct_ops progs?
> I can also easily switch the bpf_wq specific cases in the verifier with
> those suffixes. There are still one or two wq specifics I haven't
> implemented through __s_async, but that would still makes things more
> generic.
>
> Cheers,
> Benjamin
Powered by blists - more mailing lists