Message-ID: <YUSrWiWh57Ys7UdB@lore-desk>
Date: Fri, 17 Sep 2021 16:51:06 +0200
From: Lorenzo Bianconi <lorenzo@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
lorenzo.bianconi@...hat.com, davem@...emloft.net, ast@...nel.org,
daniel@...earbox.net, shayagr@...zon.com, john.fastabend@...il.com,
dsahern@...nel.org, brouer@...hat.com, echaudro@...hat.com,
jasowang@...hat.com, alexander.duyck@...il.com, saeed@...nel.org,
maciej.fijalkowski@...el.com, magnus.karlsson@...el.com,
tirthendu.sarkar@...el.com, toke@...hat.com
Subject: Re: [PATCH v14 bpf-next 00/18] mvneta: introduce XDP multi-buffer
support
> On Fri, 10 Sep 2021 18:14:06 +0200 Lorenzo Bianconi wrote:
> > The following two ebpf helpers (and related selftests) have been introduced:
> > - bpf_xdp_adjust_data:
> > Move xdp_md->data and xdp_md->data_end pointers in subsequent fragments
> > according to the offset provided by the ebpf program. This helper can be
> > used to read/write values in frame payload.
> > - bpf_xdp_get_buff_len:
> > Return the total frame size (linear + paged parts)
>
> > More info about the main idea behind this approach can be found here [1][2].
>
> Is there much critique of the skb helpers we have? My intuition would
> be to follow a similar paradigm from the API perspective. It may seem
> trivial to us to switch between the two but "normal" users could easily
> be confused.
>
> By skb paradigm I mean skb_pull_data() and bpf_skb_load/store_bytes().
>
> Alternatively how about we produce a variation on skb_header_pointer()
> (use on-stack buffer or direct access if the entire region is in one
> frag).
>
> bpf_xdp_adjust_data() seems to add cost to helpers and TBH I'm not sure
> how practical it would be to applications. My understanding is that the
> application is not supposed to make assumptions about the fragment
> geometry, meaning data can be split at any point. Parsing data
> arbitrarily split into buffers is hard if pull() is not an option, let
> alone making such parsing provably correct.
>
> Won't applications end up building something like skb_header_pointer()
> based on bpf_xdp_adjust_data(), anyway? In which case why don't we
> provide them what they need?
Please correct me if I am wrong: you mean that with bpf_xdp_adjust_data()
we are moving the logic for reading/writing data across fragment boundaries
into the caller. Right.
I do not have a clear view of what a real use-case for this helper would be
(maybe John can help on this), but, similar to what you are suggesting, what
about doing something like bpf_skb_load/store_bytes()?
- bpf_xdp_load_bytes(struct xdp_buff *xdp_md, u32 offset, u32 len,
                     void *data)
- bpf_xdp_store_bytes(struct xdp_buff *xdp_md, u32 offset, u32 len,
                      void *data)
These helpers can take care of reading/writing across fragment boundaries
and hide the frame layout from the caller. The only downside here (as for
bpf_skb_load/store_bytes()) is that we need to copy. But in a real
application, is that actually an issue? (We have much lower pps rates for
XDP multi-buff anyway.)
Moreover, I do not know whether this solution would require some verifier
changes.
@John: can this approach work in your use-case?
Anyway I think we should try to get everyone on the same page here, since the
helper may change depending on the specific use-case. This series is on the
agenda for LPC next week, so I hope you and others who have an opinion about
this will find the time to come and discuss it during the conference :)
Regards,
Lorenzo
>
> say:
>
> void *xdp_mb_pointer(struct xdp_buff *xdp_md, u32 flags,
> u32 offset, u32 len, void *stack_buf)
>
> flags and offset can be squashed into one u64 as needed. Helper returns
> pointer to packet data, either real one or stack_buf. Verifier has to
> be taught that the return value is NULL or a pointer which is safe with
> offsets up to @len.
>
> If the reason for access is write we'd also need:
>
> void *xdp_mb_pointer_flush(struct xdp_buff *xdp_md, u32 flags,
> u32 offset, u32 len, void *stack_buf)
>
> Same inputs, if stack buffer was used it does write back, otherwise nop.
>
> Sorry for the longish email if I'm missing something obvious and/or
> discussed earlier.
>
>
> The other thing I wanted to double check - was the decision on program
> compatibility made? Is a new program type an option? It'd be extremely
> useful operationally to be able to depend on kernel enforcement.