Message-Id: <20240416-bpf_wq-v1-0-c9e66092f842@kernel.org>
Date: Tue, 16 Apr 2024 16:08:13 +0200
From: Benjamin Tissoires <bentiss@...nel.org>
To: Alexei Starovoitov <ast@...nel.org>, 
 Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, 
 Martin KaFai Lau <martin.lau@...ux.dev>, 
 Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>, 
 Yonghong Song <yonghong.song@...ux.dev>, 
 John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>, 
 Stanislav Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>, 
 Jiri Olsa <jolsa@...nel.org>, Mykola Lysenko <mykolal@...com>, 
 Shuah Khan <shuah@...nel.org>
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org, 
 linux-kselftest@...r.kernel.org, Benjamin Tissoires <bentiss@...nel.org>
Subject: [PATCH bpf-next 00/18] Introduce bpf_wq
This is a follow-up to sleepable bpf_timer[0].
When discussing sleepable bpf_timer, it was suggested that we give
bpf_wq a try, as the 2 APIs are similar but distinct enough to
justify a new one.
So here it is.
I tried to keep as much common code as possible in kernel/bpf/helpers.c,
but I couldn't avoid some code duplication in kernel/bpf/verifier.c.
This series introduces basic bpf_wq support:
- creation is supported
- assignment is supported
- running a simple bpf_wq is also supported.
We will probably need to extend the API further with:
- a full delayed_work API (can be piggybacked on top with the right
  flag)
- bpf_wq_cancel()
- bpf_wq_cancel_sync() (for sleepable programs)
- documentation
But for now, let's focus on what we currently have to see if it's worth
it compared to sleepable bpf_timer.
FWIW, I still have a couple of concerns with this implementation:
- I'm explicitly declaring the async callback as sleepable or not
  (BPF_F_WQ_SLEEPABLE) through a flag. Is it really worth it?
  Or should I just consider that any wq is running in a sleepable
  context?
- bpf_wq_work() accesses ->prog without protection, and I think this might
  race with bpf_wq_set_callback(). Consider the following:
  CPU 0                                     CPU 1
  bpf_wq_set_callback()
  bpf_wq_start()
                                            bpf_wq_work():
                                              prog = cb->prog;
  bpf_wq_set_callback()
    cb->prog = prog;
    bpf_prog_put(prev)
    rcu_assign_pointer(cb->callback_fn,
                       callback_fn);
                                              callback = READ_ONCE(w->cb.callback_fn);
  As I understand it, callback_fn is fine and prog might be, but we clearly
  have an inconsistency between "prog" and "callback_fn", as they can come
  from 2 different bpf_wq_set_callback() calls.
  IMO we should protect this with async->lock, but I'm not sure whether
  that is OK.
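
To make the consistency concern concrete, here is a minimal userspace
sketch, not kernel code. Every toy_* name is hypothetical, and a C11
atomic_flag spinlock merely stands in for async->lock. It only models the
property being asked for: if both the writer and the reader touch "prog"
and "callback_fn" inside the same critical section, the reader always sees
a pair that came from a single set_callback call. Whether taking the real
lock in bpf_wq_work() is acceptable is exactly the open question above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model only: all toy_* names are hypothetical, not kernel API. */
typedef int (*toy_callback_t)(int arg);

struct toy_prog {
	int id;
};

struct toy_async_cb {
	atomic_flag lock;           /* stands in for async->lock */
	struct toy_prog *prog;
	toy_callback_t callback_fn;
};

/* A (prog, callback_fn) pair copied out in one critical section. */
struct toy_snapshot {
	struct toy_prog *prog;
	toy_callback_t callback_fn;
};

static void toy_lock(atomic_flag *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void toy_cb_init(struct toy_async_cb *cb)
{
	assert(cb != NULL);
	atomic_flag_clear(&cb->lock);
	cb->prog = NULL;
	cb->callback_fn = NULL;
}

/* Writer side: update both fields under the lock. */
static void toy_set_callback(struct toy_async_cb *cb, struct toy_prog *prog,
			     toy_callback_t fn)
{
	assert(cb != NULL);
	toy_lock(&cb->lock);
	cb->prog = prog;
	cb->callback_fn = fn;
	toy_unlock(&cb->lock);
}

/* Reader side: copy both fields under the same lock, then use the copy. */
static struct toy_snapshot toy_read_pair(struct toy_async_cb *cb)
{
	struct toy_snapshot snap;

	assert(cb != NULL);
	toy_lock(&cb->lock);
	snap.prog = cb->prog;
	snap.callback_fn = cb->callback_fn;
	toy_unlock(&cb->lock);
	return snap;
}

/* Two distinguishable callbacks for demonstration purposes. */
static int toy_cb_a(int arg) { return arg + 1; }
static int toy_cb_b(int arg) { return arg * 2; }
```

  In this model the reader works on the snapshot after dropping the lock,
  so the callback itself never runs with the lock held.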
---
For reference, the use cases I have in mind:
---
Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
   Sometimes we receive an out of proximity event, but the device cannot
   be trusted enough, and we need to ensure that we won't receive another
   one in the following n milliseconds. So we need to wait those n
   milliseconds, and eventually re-inject that event in the stack.
2. inject new events in reaction to one given event:
   We might want to transform one given event into several. This is the
   case for macro keys where a single key press is supposed to send
   a sequence of key presses. But this could also be used to patch a
   faulty behavior, if a device forgets to send a release event.
3. communicate with the device in reaction to one event:
   We might want to communicate back to the device after a given event.
   For example a device might send us an event saying that it came back
   from sleeping state and needs to be re-initialized.
Currently we can achieve that by keeping a userspace program around,
raising a bpf event, and letting that userspace program inject the events
and commands.
However, we are keeping that program alive as a daemon only to
schedule commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. So a kernel workqueue seems simpler to handle.
bpf_timers currently run in a soft IRQ context; this patch
series implements a sleepable context for them.
Cheers,
Benjamin
Signed-off-by: Benjamin Tissoires <bentiss@...nel.org>
[0] https://lore.kernel.org/all/20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org/
---
Benjamin Tissoires (18):
      bpf: trampoline: export __bpf_prog_enter/exit_recur
      bpf: make timer data struct more generic
      bpf: replace bpf_timer_init with a generic helper
      bpf: replace bpf_timer_set_callback with a generic helper
      bpf: replace bpf_timer_cancel_and_free with a generic helper
      bpf: add support for bpf_wq user type
      tools: sync include/uapi/linux/bpf.h
      bpf: add support for KF_ARG_PTR_TO_WORKQUEUE
      bpf: allow struct bpf_wq to be embedded in arraymaps and hashmaps
      selftests/bpf: add bpf_wq tests
      bpf: wq: add bpf_wq_init
      tools: sync include/uapi/linux/bpf.h
      selftests/bpf: wq: add bpf_wq_init() checks
      bpf/verifier: add is_sleepable argument to push_callback_call
      bpf: wq: add bpf_wq_set_callback_impl
      selftests/bpf: add checks for bpf_wq_set_callback()
      bpf: add bpf_wq_start
      selftests/bpf: wq: add bpf_wq_start() checks
 include/linux/bpf.h                                |  17 +-
 include/linux/bpf_verifier.h                       |   1 +
 include/uapi/linux/bpf.h                           |  13 +
 kernel/bpf/arraymap.c                              |  18 +-
 kernel/bpf/btf.c                                   |  17 +
 kernel/bpf/hashtab.c                               |  55 ++-
 kernel/bpf/helpers.c                               | 371 ++++++++++++++++-----
 kernel/bpf/syscall.c                               |  16 +-
 kernel/bpf/trampoline.c                            |   6 +-
 kernel/bpf/verifier.c                              | 195 ++++++++++-
 tools/include/uapi/linux/bpf.h                     |  13 +
 tools/testing/selftests/bpf/bpf_experimental.h     |   7 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c        |   5 +
 .../selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h  |   1 +
 tools/testing/selftests/bpf/prog_tests/wq.c        |  41 +++
 tools/testing/selftests/bpf/progs/wq.c             | 192 +++++++++++
 tools/testing/selftests/bpf/progs/wq_failures.c    | 197 +++++++++++
 17 files changed, 1052 insertions(+), 113 deletions(-)
---
base-commit: ffa6b26b4d8a0520b78636ca9373ab842cb3b1a8
change-id: 20240411-bpf_wq-fe24e8d24f5e
Best regards,
-- 
Benjamin Tissoires <bentiss@...nel.org>