[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPhsuW5KLgt_gsih7zi+T99iYVbt7hk7=OCwYzin-H3=OhF54Q@mail.gmail.com>
Date: Sun, 19 Nov 2023 13:02:33 -0800
From: Song Liu <song@...nel.org>
To: Akihiko Odaki <akihiko.odaki@...nix.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Jason Wang <jasowang@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Yonghong Song <yonghong.song@...ux.dev>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>,
Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
Jonathan Corbet <corbet@....net>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>,
bpf <bpf@...r.kernel.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>,
Yuri Benditovich <yuri.benditovich@...nix.com>,
Andrew Melnychenko <andrew@...nix.com>
Subject: Re: [RFC PATCH v2 1/7] bpf: Introduce BPF_PROG_TYPE_VNET_HASH
On Sun, Nov 19, 2023 at 12:03 AM Akihiko Odaki <akihiko.odaki@...nix.com> wrote:
>
[...]
>
> Unfortunately no. The communication with the userspace can be done with
> two different means:
> - usual socket read/write
> - vhost for direct interaction with a KVM guest
>
> The BPF map may be a valid option for socket read/write, but it is not
> for vhost. In-kernel vhost may fetch hash from the BPF map, but I guess
> it's not a standard way to have an interaction between the kernel code
> and a BPF program.
I am very new to areas like vhost and KVM. So I don't really follow.
Does this mean we have the guest kernel reading data from host eBPF
programs (loaded by Qemu)?
> >
> >>
> >> Unfortunately, however, it is not acceptable for the BPF subsystem
> >> because the "stable" BPF is completely fixed these days. The
> >> "unstable/kfunc" BPF is an alternative, but the eBPF program will be
> >> shipped with a portable userspace program (QEMU)[1] so the lack of
> >> interface stability is not tolerable.
> >
> > bpf kfuncs are as stable as exported symbols. Is exported symbols
> > like stability enough for the use case? (I would assume yes.)
> >
> >>
> >> Another option is to hardcode the algorithm that was conventionally
> >> implemented with eBPF steering program in the kernel[2]. It is possible
> >> because the algorithm strictly follows the virtio-net specification[3].
> >> However, there are proposals to add different algorithms to the
> >> specification[4], and hardcoding the algorithm to the kernel will
> >> require to add more UAPIs and code each time such a specification change
> >> happens, which is not good for tuntap.
> >
> > The requirement looks similar to hid-bpf. Could you explain why that
> > model is not enough? HID also requires some stability AFAICT.
>
> I have little knowledge with hid-bpf, but I assume it is more like a
> "safe" kernel module; in my understanding, it affects the system state
> and is intended to be loaded with some kind of a system daemon. It is
> fine to have the same lifecycle with the kernel for such a BPF program;
> whenever the kernel is updated, the distributor can recompile the BPF
> program with the new kernel headers and ship it along with the kernel
> just as like a kernel module.
>
> In contrast, our intended use case is more like a normal application.
> So, for example, a user may download a container and run QEMU (including
> the BPF program) installed in the container. As such, it is nice if the
> ABI is stable across kernel releases, but it is not guaranteed for
> kfuncs. Such a use case is already covered with the eBPF steering
> program so I want to maintain it if possible.
TBH, I don't think stability should be a concern for kfuncs used by QEMU.
Many core BPF APIs are now implemented as kfuncs: bpf_dynptr_*,
bpf_rcu_*, etc. As long as there are valid use cases,these kfuncs will
be supported.
Thanks,
Song
Powered by blists - more mailing lists