lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAADnVQ+CCOw9_LbCAaFz0593eydKNb7RxnGr6_FatUOKmvPmBg@mail.gmail.com>
Date: Tue, 13 Jun 2023 21:19:45 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Stanislav Fomichev <sdf@...gle.com>
Cc: Toke Høiland-Jørgensen <toke@...nel.org>, 
	bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, 
	Martin KaFai Lau <martin.lau@...ux.dev>, Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>, 
	John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>, 
	Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>, 
	Willem de Bruijn <willemb@...gle.com>, David Ahern <dsahern@...nel.org>, 
	"Karlsson, Magnus" <magnus.karlsson@...el.com>, Björn Töpel <bjorn@...nel.org>, 
	"Fijalkowski, Maciej" <maciej.fijalkowski@...el.com>, Network Development <netdev@...r.kernel.org>
Subject: Re: [RFC bpf-next 0/7] bpf: netdev TX metadata

On Tue, Jun 13, 2023 at 4:16 PM Stanislav Fomichev <sdf@...gle.com> wrote:
>
> On Tue, Jun 13, 2023 at 3:32 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > On Tue, Jun 13, 2023 at 2:17 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > >
> > > > >> >> > --- UAPI ---
> > > > >> >> >
> > > > >> >> > The hooks are implemented in a HID-BPF style. Meaning they don't
> > > > >> >> > expose any UAPI and are implemented as tracing programs that call
> > > > >> >> > a bunch of kfuncs. The attach/detach operation happen via BPF syscall
> > > > >> >> > programs. The series expands device-bound infrastructure to tracing
> > > > >> >> > programs.
> > > > >> >>
> > > > >> >> Not a fan of the "attach from BPF syscall program" thing. These are part
> > > > >> >> of the XDP data path API, and I think we should expose them as proper
> > > > >> >> bpf_link attachments from userspace with introspection etc. But I guess
> > > > >> >> the bpf_mprog thing will give us that?
> > > > >> >
> > > > >> > bpf_mprog will just make those attach kfuncs return the link fd. The
> > > > >> > syscall program will still stay :-(
> > > > >>
> > > > >> Why does the attachment have to be done this way, exactly? Couldn't we
> > > > >> just use the regular bpf_link attachment from userspace? AFAICT it's not
> > > > >> really piggy-backing on the function override thing anyway when the
> > > > >> attachment is per-dev? Or am I misunderstanding how all this works?
> > > > >
> > > > > It's UAPI vs non-UAPI. I'm assuming kfunc makes it non-UAPI and gives
> > > > > us an opportunity to fix things.
> > > > > We can do it via a regular syscall path if there is a consensus.
> > > >
> > > > Yeah, the API exposed to the BPF program is kfunc-based in any case. If
> > > > we were to at some point conclude that this whole thing was not useful
> > > > at all and deprecate it, it doesn't seem to me that it makes much
> > > > difference whether that means "you can no longer create a link
> > > > attachment of this type via BPF_LINK_CREATE" or "you can no longer
> > > > create a link attachment of this type via BPF_PROG_RUN of a syscall type
> > > > program" doesn't really seem like a significant detail to me...
> > >
> > > In this case, why do you prefer it to go via regular syscall? Seems
> > > like we can avoid a bunch of boileplate syscall work with a kfunc that
> > > does the attachment?
> > > We might as well abstract it at, say, libbpf layer which would
> > > generate/load this small bpf program to call a kfunc.
> >
> > I'm not sure we're on the same page here.
> > imo using syscall bpf prog that calls kfunc to do a per-device attach
> > is an overkill here.
> > It's an experimental feature, but you're already worried about
> > multiple netdevs?
> >
> > Can you add an empty nop function and attach to it tracing style
> > with fentry ?
> > It won't be per-netdev, but do you have to do per-device demux
> > by the kernel? Can your tracing bpf prog do that instead?
> > It's just an ifindex compare.
> > This way than non-uapi bits will be even smaller and no need
> > to change struct netdevice.
>
> It's probably going to work if each driver has a separate set of tx
> fentry points, something like:
>   {veth,mlx5,etc}_devtx_submit()
>   {veth,mlx5,etc}_devtx_complete()

Right. And per-driver descriptors.
The 'generic' xdptx metadata is unlikely to be practical.
Marshaling in and out of it is going to be too perf sensitive.
I'd just add an attach point in the driver with enough
args for bpf progs to make sense of the context and extend
the verifier to make few safe fields writeable.
kfuncs to read/request timestamp are probably too slow.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ