Message-ID: <CAEf4Bzabg=YsiR6re3XLxFAptFW3sECA4v2_e0AE_TRNsDWm-w@mail.gmail.com>
Date: Fri, 3 Feb 2023 13:37:46 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Martin KaFai Lau <martin.lau@...ux.dev>,
Joanne Koong <joannelkoong@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Network Development <netdev@...r.kernel.org>,
Kumar Kartikeya Dwivedi <memxor@...il.com>,
Kernel Team <kernel-team@...com>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH v9 bpf-next 3/5] bpf: Add skb dynptrs

On Thu, Feb 2, 2023 at 3:43 AM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Wed, Feb 1, 2023 at 5:21 PM Andrii Nakryiko
> <andrii.nakryiko@...il.com> wrote:
> >
> > On Tue, Jan 31, 2023 at 4:40 PM Alexei Starovoitov
> > <alexei.starovoitov@...il.com> wrote:
> > >
> > > On Tue, Jan 31, 2023 at 04:11:47PM -0800, Andrii Nakryiko wrote:
> > > > >
> > > > > When prog is just parsing the packet it doesn't need to finalize with bpf_dynptr_write.
> > > > > The prog can always write into the pointer followed by if (p == buf) bpf_dynptr_write.
> > > > > No need for rdonly flag, but extra copy is there in case of cloned which
> > > > > could have been avoided with extra rd_only flag.
> > > >
> > > > Yep, given we are designing bpf_dynptr_slice for performance, extra
> > > > copy on reads is unfortunate. ro/rw flag or have separate
> > > > bpf_dynptr_slice_rw vs bpf_dynptr_slice_ro?
> > >
> > > Either flag or two kfuncs sound good to me.
> >
> > Would it make sense to make bpf_dynptr_slice() as read-only variant,
> > and bpf_dynptr_slice_rw() for read/write? I think the common case is
> > read-only, right? And if users mistakenly use bpf_dynptr_slice() for
> > r/w case, they will get a verifier error when trying to write into the
> > returned pointer. While if we make bpf_dynptr_slice() as read-write,
> > users won't realize they are paying a performance penalty for
> > something that they don't actually need.
>
> Makes sense and it matches skb_header_pointer() usage in the kernel
> which is read-only. Since there is no verifier the read-only-ness
> is not enforced, but we can do it.
>
> Looks like we've converged on bpf_dynptr_slice() and bpf_dynptr_slice_rw().
> The question remains what to do with bpf_dynptr_data() backed by skb/xdp.
> Should we return EINVAL to discourage its usage?
> Of course, we can come up with sensible behavior for bpf_dynptr_data(),
> but it will have quirks that will be not easy to document.
> Even with extensive docs the users might be surprised by the behavior.

I feel like having bpf_dynptr_data() work in the common case for
skb/xdp would be nice (i.e., it should at least work in cases where
we don't need to pull).

But we've been discussing bpf_dynptr_slice() with Joanne today, and we
came to the conclusion that bpf_dynptr_slice()/bpf_dynptr_slice_rw()
should work for any kind of dynptr (LOCAL, RINGBUF, SKB, XDP). So
generic code that wants to work with any dynptr would be able to just
use bpf_dynptr_slice(), even for LOCAL/RINGBUF, even though the buffer
won't ever be filled for LOCAL/RINGBUF.
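
To illustrate the slice semantics described above, here is a minimal
userspace sketch. It is not kernel code: struct model_dynptr and
model_dynptr_slice() are hypothetical names invented for this example,
and the "linear" vs "frag" split is only an assumed stand-in for data
that is or is not directly addressable. The point is that the
caller-supplied buffer is only touched when the requested range is not
contiguous, so a LOCAL/RINGBUF-like dynptr never fills it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; NOT the kernel's struct bpf_dynptr. */
struct model_dynptr {
	unsigned char *linear;	/* directly addressable bytes */
	size_t linear_len;
	unsigned char *frag;	/* bytes that would need a copy (e.g. skb frags) */
	size_t frag_len;	/* 0 for LOCAL/RINGBUF-like dynptrs */
};

/*
 * Read-only slice (model): return a pointer to len bytes at offset off.
 * If the range is fully inside the linear part, return a direct pointer
 * and leave buf untouched. Otherwise copy the range into buf and return
 * buf. Return NULL if the range is out of bounds.
 */
static const void *model_dynptr_slice(const struct model_dynptr *p,
				      size_t off, size_t len, void *buf)
{
	size_t total, copied = 0;

	assert(p != NULL);
	total = p->linear_len + p->frag_len;
	if (off > total || len > total - off)
		return NULL;

	/* Fast path: no copy, buf is not used at all. */
	if (off + len <= p->linear_len)
		return p->linear + off;

	/* Slow path: gather the bytes from both parts into buf. */
	if (off < p->linear_len) {
		copied = p->linear_len - off;
		memcpy(buf, p->linear + off, copied);
		off = p->linear_len;
	}
	memcpy((unsigned char *)buf + copied,
	       p->frag + (off - p->linear_len), len - copied);
	return buf;
}
```

In this model a caller can compare the returned pointer against buf to
tell whether a copy happened, which is the same check as the
"if (p == buf)" pattern quoted earlier in the thread.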

In an application, though, if I know I'm working with LOCAL or RINGBUF
(or MALLOC, once we have it), I'd use bpf_dynptr_data() to fill out
fixed parts, of course. bpf_dynptr_slice() would be cumbersome for
such cases (especially if I have some huge fixed part that I *know* is
available in the RINGBUF/MALLOC case).
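
For contrast, a similarly hedged userspace sketch of the
bpf_dynptr_data()-style access for memory that is known to be
contiguous. Again, struct model_mem, struct event_hdr,
model_dynptr_data() and fill_event_hdr() are hypothetical names made
up for this example, not kernel APIs. No scratch buffer is involved:
the call either yields a direct pointer into the backing memory or
fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a LOCAL/RINGBUF-like dynptr: one flat region. */
struct model_mem {
	unsigned char *data;
	size_t size;
};

/* Hypothetical fixed-size part of a record. */
struct event_hdr {
	uint32_t pid;
	uint32_t len;
};

/*
 * Direct data pointer (model): return a pointer to len bytes at offset
 * off, or NULL if the range does not fit. No caller buffer is needed
 * because the memory is always contiguous in this model.
 */
static void *model_dynptr_data(const struct model_mem *m,
			       size_t off, size_t len)
{
	assert(m != NULL);
	if (off > m->size || len > m->size - off)
		return NULL;
	return m->data + off;
}

/* Fill the fixed header at offset 0; returns 0 on success, -1 on error. */
static int fill_event_hdr(const struct model_mem *m,
			  uint32_t pid, uint32_t len)
{
	struct event_hdr h = { .pid = pid, .len = len };
	void *dst = model_dynptr_data(m, 0, sizeof(h));

	if (!dst)
		return -1;
	/* memcpy keeps the sketch free of alignment and aliasing concerns. */
	memcpy(dst, &h, sizeof(h));
	return 0;
}
```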

With this setup we probably won't ever need bpf_dynptr_data_rdonly(),
because we can tell users to use bpf_dynptr_slice() for that (even
with an unnecessary buffer).