[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzY7GFVWpMKszVL3t_SRAeGPRG9SgKFO_srCZdavDWFDKg@mail.gmail.com>
Date: Mon, 30 Jan 2023 17:19:26 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Joanne Koong <joannelkoong@...il.com>
Cc: bpf@...r.kernel.org, daniel@...earbox.net, andrii@...nel.org,
martin.lau@...nel.org, ast@...nel.org, netdev@...r.kernel.org,
memxor@...il.com, kernel-team@...com
Subject: Re: [PATCH v9 bpf-next 3/5] bpf: Add skb dynptrs
On Mon, Jan 30, 2023 at 5:13 PM Joanne Koong <joannelkoong@...il.com> wrote:
>
> On Mon, Jan 30, 2023 at 5:06 PM Andrii Nakryiko
> <andrii.nakryiko@...il.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 4:56 PM Joanne Koong <joannelkoong@...il.com> wrote:
> > >
> > > On Mon, Jan 30, 2023 at 4:48 PM Andrii Nakryiko
> > > <andrii.nakryiko@...il.com> wrote:
> > > >
> > > > On Fri, Jan 27, 2023 at 11:18 AM Joanne Koong <joannelkoong@...il.com> wrote:
> > > > >
> > > > > Add skb dynptrs, which are dynptrs whose underlying pointer points
> > > > > to a skb. The dynptr acts on skb data. skb dynptrs have two main
> > > > > benefits. One is that they allow operations on sizes that are not
> > > > > statically known at compile-time (eg variable-sized accesses).
> > > > > Another is that parsing the packet data through dynptrs (instead of
> > > > > through direct access of skb->data and skb->data_end) can be more
> > > > > ergonomic and less brittle (eg does not need manual if checking for
> > > > > being within bounds of data_end).
> > > > >
> > > > > For bpf prog types that don't support writes on skb data, the dynptr is
> > > > > read-only (bpf_dynptr_write() will return an error and bpf_dynptr_data()
> > > > > will return a data slice that is read-only where any writes to it will
> > > > > be rejected by the verifier).
> > > > >
> > > > > For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write()
> > > > > interfaces, reading and writing from/to data in the head as well as from/to
> > > > > non-linear paged buffers is supported. For data slices (through the
> > > > > bpf_dynptr_data() interface), if the data is in a paged buffer, the user
> > > > > must first call bpf_skb_pull_data() to pull the data into the linear
> > > > > portion.
> > > > >
> > > > > Any bpf_dynptr_write() automatically invalidates any prior data slices
> > > > > to the skb dynptr. This is because a bpf_dynptr_write() may be writing
> > > > > to data in a paged buffer, so it will need to pull the buffer first into
> > > > > the head. The reason it needs to be pulled instead of writing directly to
> > > > > the paged buffers is because they may be cloned (only the head of the skb
> > > > > is by default uncloned). As such, any bpf_dynptr_write() will
> > > > > automatically have its prior data slices invalidated, even if the write
> > > > > is to data in the skb head (the verifier has no way of differentiating
> > > > > whether the write is to the head or paged buffers during program load
> > > > > time). Please note as well that any other helper calls that change the
> > > > > underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data
> > > > > slices of the skb dynptr as well. The stack trace for this is
> > > > > check_helper_call() -> clear_all_pkt_pointers() ->
> > > > > __clear_all_pkt_pointers() -> mark_reg_unknown().
> > > > >
> > > > > For examples of how skb dynptrs can be used, please see the attached
> > > > > selftests.
> > > > >
> > > > > Signed-off-by: Joanne Koong <joannelkoong@...il.com>
> > > > > ---
> > > > > include/linux/bpf.h | 82 +++++++++------
> > > > > include/linux/filter.h | 18 ++++
> > > > > include/uapi/linux/bpf.h | 37 +++++--
> > > > > kernel/bpf/btf.c | 18 ++++
> > > > > kernel/bpf/helpers.c | 95 ++++++++++++++---
> > > > > kernel/bpf/verifier.c | 185 ++++++++++++++++++++++++++-------
> > > > > net/core/filter.c | 60 ++++++++++-
> > > > > tools/include/uapi/linux/bpf.h | 37 +++++--
> > > > > 8 files changed, 432 insertions(+), 100 deletions(-)
> > > > >
> > > >
> > > > [...]
> > > >
> > > > > static const struct bpf_func_proto bpf_dynptr_write_proto = {
> > > > > @@ -1560,6 +1595,8 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = {
> > > > >
> > > > > BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
> > > > > {
> > > > > + enum bpf_dynptr_type type;
> > > > > + void *data;
> > > > > int err;
> > > > >
> > > > > if (!ptr->data)
> > > > > @@ -1569,10 +1606,36 @@ BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u3
> > > > > if (err)
> > > > > return 0;
> > > > >
> > > > > - if (bpf_dynptr_is_rdonly(ptr))
> > > > > - return 0;
> > > > > + type = bpf_dynptr_get_type(ptr);
> > > > > +
> > > > > + switch (type) {
> > > > > + case BPF_DYNPTR_TYPE_LOCAL:
> > > > > + case BPF_DYNPTR_TYPE_RINGBUF:
> > > > > + if (bpf_dynptr_is_rdonly(ptr))
> > > > > + return 0;
> > > >
> > > > will something break if we return ptr->data for read-only LOCAL/RINGBUF dynptr?
> > >
> > > There will be nothing guarding against direct writes into read-only
> > > LOCAL/RINGBUF dynptrs if we return ptr->data. For skb type dynptrs,
> > > it's guarded by the ptr->data return pointer being marked as
> > > MEM_RDONLY in the verifier if the skb is non-writable.
> > >
> >
> > Ah, so we won't add MEM_RDONLY for bpf_dynptr_data()'s returned
> > PTR_TO_MEM if we know (statically) that dynptr is read-only?
>
> I think you meant will, not won't? If so, then yes we only add
> MEM_RDONLY for the returned data slice if we can pre-determine that
> the dynptr is read-only, else bpf_dynptr_data() will return null.
>
> >
> > Ok, not a big deal, just something that we might want to improve in the future.
>
> I'm curious to hear how you think this could be improved. If we're not
> able to know statically whether the dynptr is read-only or writable,
> then there's no way to enforce it in the verifier before the bpf
> program runs. Or is there some way to do this?
I might be just confused, I thought the conclusion from previous
discussions were that we do know statically if dynptr is read-only? If
that's not the case, then yeah, we can't really do much about this.
Either way, I think this is a small thing, as in practice
LOCAL/RINGBUF dynptrs will always be read-write, right?
>
> >
> > > >
> > > > > +
> > > > > + data = ptr->data;
> > > > > + break;
> > > > > + case BPF_DYNPTR_TYPE_SKB:
> > > > > + {
> > > > > + struct sk_buff *skb = ptr->data;
> > > > >
> > > >
> > > > [...]
Powered by blists - more mailing lists