[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <614511bc3408b_8d5120862@john-XPS-13-9370.notmuch>
Date: Fri, 17 Sep 2021 15:07:56 -0700
From: John Fastabend <john.fastabend@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Jakub Kicinski <kuba@...nel.org>
Cc: Lorenzo Bianconi <lorenzo@...nel.org>, bpf <bpf@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
Lorenzo Bianconi <lorenzo.bianconi@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, shayagr@...zon.com,
John Fastabend <john.fastabend@...il.com>,
David Ahern <dsahern@...nel.org>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Eelco Chaudron <echaudro@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Alexander Duyck <alexander.duyck@...il.com>,
Saeed Mahameed <saeed@...nel.org>,
"Fijalkowski, Maciej" <maciej.fijalkowski@...el.com>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>,
tirthendu.sarkar@...el.com,
Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH v14 bpf-next 00/18] mvneta: introduce XDP multi-buffer
support
Alexei Starovoitov wrote:
> On Fri, Sep 17, 2021 at 12:00 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Fri, 17 Sep 2021 11:43:07 -0700 Alexei Starovoitov wrote:
> > > > If bpf_xdp_load_bytes() / bpf_xdp_store_bytes() works for most we
> > > > can start with that. In all honesty I don't know what the exact
> > > > use cases for looking at data are, either. I'm primarily worried
> > > > about exposing the kernel internals too early.
> > >
> > > I don't mind the xdp equivalent of skb_load_bytes,
> > > but skb_header_pointer() idea is superior.
> > > When we did xdp with data/data_end there was no refine_retval_range
> > > concept in the verifier (iirc or we just missed that opportunity).
> > > We'd need something more advanced: a pointer with valid range
> > > refined by input argument 'len' or NULL.
> > > The verifier doesn't have such thing yet, but it fits as a combination of
> > > value_or_null plus refine_retval_range.
> > > The bpf_xdp_header_pointer() and bpf_skb_header_pointer()
> > > would probably simplify bpf programs as well.
> > > There would be no need to deal with data/data_end.
> >
> > What are your thoughts on inlining? Can we inline the common case
> > of the header being in the "head"? Otherwise data/end comparisons
> > would be faster.
>
> Yeah. It can be inlined by the verifier.
> It would still look like a call from bpf prog pov with llvm doing spill/fill
> of scratched regs, but it's minor.
>
> Also we can use the same bpf_header_pointer(ctx, ...)
> helper for both xdp and skb program types. They will have different
> implementation underneath, but this might make possible writing bpf
> programs that could work in both xdp and skb context.
> I believe cilium has fancy macros to achieve that.
Hi,
First a header_pointer() logic that works across skb and xdp seems like
a great idea to me. I wonder though if instead of doing the copy
into a new buffer for offset past the initial frag like what is done in
skb_header_pointer could we just walk the frags and point at the new offset.
This is what we do on the socket side with bpf_msg_pull-data() for example.
For XDP it should also work. The skb case would depend on clone state
and things so might be a bit more tricky there.
This has the advantage of only doing the copy when its necessary. This
can be useful for example when reading the tail of an IPsec packet. With
blind copy most packets will get hit with a copy. By just writing the
pkt->data and pkt->data_end we can avoid this case.
Lorenz originally implemented something similar earlier and we had the
refine retval logic. It failed on no-alu32 for some reason we could
revisit. I didn't mind the current help returning with data pointer set
to the start of the frag so we stopped following up on it.
I agree though the current implementation puts a lot on the BPF writer.
So getting both cases covered, I want to take pains in my BPF prog
to avoid copies and I just want these bytes handled behind a single
helper seems good to me.
Thanks,
John
Powered by blists - more mailing lists