linux-kernel - Re: [RFC PATCH v2 1/7] bpf: Introduce BPF_PROG_TYPE_VNET

Message-ID: <CAPhsuW5KLgt_gsih7zi+T99iYVbt7hk7=OCwYzin-H3=OhF54Q@mail.gmail.com>
Date:   Sun, 19 Nov 2023 13:02:33 -0800
From:   Song Liu <song@...nel.org>
To:     Akihiko Odaki <akihiko.odaki@...nix.com>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Jason Wang <jasowang@...hat.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Yonghong Song <yonghong.song@...ux.dev>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        Jonathan Corbet <corbet@....net>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
        Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>,
        bpf <bpf@...r.kernel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Network Development <netdev@...r.kernel.org>,
        kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>,
        Yuri Benditovich <yuri.benditovich@...nix.com>,
        Andrew Melnychenko <andrew@...nix.com>
Subject: Re: [RFC PATCH v2 1/7] bpf: Introduce BPF_PROG_TYPE_VNET_HASH

On Sun, Nov 19, 2023 at 12:03 AM Akihiko Odaki <akihiko.odaki@...nix.com> wrote:
>
[...]
>
> Unfortunately, no. Communication with userspace can be done by
> two different means:
> - usual socket read/write
> - vhost for direct interaction with a KVM guest
>
> The BPF map may be a valid option for socket read/write, but it is not
> for vhost. In-kernel vhost may fetch hash from the BPF map, but I guess
> it's not a standard way to have an interaction between the kernel code
> and a BPF program.

I am very new to areas like vhost and KVM, so I don't really follow.
Does this mean the guest kernel reads data from host eBPF programs
(loaded by QEMU)?

> >
> >>
> >> Unfortunately, however, it is not acceptable for the BPF subsystem
> >> because the "stable" BPF is completely fixed these days. The
> >> "unstable/kfunc" BPF is an alternative, but the eBPF program will be
> >> shipped with a portable userspace program (QEMU)[1] so the lack of
> >> interface stability is not tolerable.
> >
> > bpf kfuncs are as stable as exported symbols. Is exported-symbol-like
> > stability enough for the use case? (I would assume yes.)
> >
> >>
> >> Another option is to hardcode the algorithm that was conventionally
> >> implemented with eBPF steering program in the kernel[2]. It is possible
> >> because the algorithm strictly follows the virtio-net specification[3].
> >> However, there are proposals to add different algorithms to the
> >> specification[4], and hardcoding the algorithm in the kernel will
> >> require adding more UAPIs and code each time such a specification
> >> change happens, which is not good for tuntap.
> >
> > The requirement looks similar to hid-bpf. Could you explain why that
> > model is not enough? HID also requires some stability AFAICT.
>
> I have little knowledge of hid-bpf, but I assume it is more like a
> "safe" kernel module; in my understanding, it affects the system state
> and is intended to be loaded by some kind of system daemon. It is
> fine for such a BPF program to have the same lifecycle as the kernel;
> whenever the kernel is updated, the distributor can recompile the BPF
> program with the new kernel headers and ship it along with the kernel,
> just like a kernel module.
>
> In contrast, our intended use case is more like a normal application.
> So, for example, a user may download a container and run QEMU (including
> the BPF program) installed in the container. As such, it is nice if the
> ABI is stable across kernel releases, but it is not guaranteed for
> kfuncs. Such a use case is already covered with the eBPF steering
> program so I want to maintain it if possible.

TBH, I don't think stability should be a concern for kfuncs used by QEMU.
Many core BPF APIs are now implemented as kfuncs: bpf_dynptr_*,
bpf_rcu_*, etc. As long as there are valid use cases, these kfuncs will
be supported.

Thanks,
Song