Message-ID: <878rtgydzt.fsf@toke.dk>
Date: Fri, 11 Mar 2022 11:34:14 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: Andrii Nakryiko <andrii.nakryiko@...il.com>,
bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <kafai@...com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Lorenzo Bianconi <lorenzo@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Jakub Kicinski <kuba@...nel.org>, Lorenz Bauer <linux@....io>,
Networking <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper

Kumar Kartikeya Dwivedi <memxor@...il.com> writes:

> On Fri, Mar 11, 2022 at 05:00:42AM IST, Toke Høiland-Jørgensen wrote:
>> Andrii Nakryiko <andrii.nakryiko@...il.com> writes:
>>
>> > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>> >>
>> >> Kumar Kartikeya Dwivedi <memxor@...il.com> writes:
>> >>
>> >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
>> >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@...il.com> wrote:
>> >> >> >
>> >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
>> >> >> > returning a packet pointer with a fixed immutable range. This can be useful to
>> >> >> > enable DPA without having to use memcpy (currently the case in
>> >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
>> >> >> >
>> >> >> > The intended usage to read and write data for multi-buff XDP is:
>> >> >> >
>> >> >> > int err = 0;
>> >> >> > char buf[N];
>> >> >> >
>> >> >> > off &= 0xffff;
>> >> >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
>> >> >> > if (unlikely(!ptr)) {
>> >> >> >         if (err < 0)
>> >> >> >                 return XDP_ABORTED;
>> >> >> >         err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
>> >> >> >         if (err < 0)
>> >> >> >                 return XDP_ABORTED;
>> >> >> >         ptr = buf;
>> >> >> > }
>> >> >> > ...
>> >> >> > // Do some stores and loads in [ptr, ptr + N) region
>> >> >> > ...
>> >> >> > if (unlikely(ptr == buf)) {
>> >> >> >         err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
>> >> >> >         if (err < 0)
>> >> >> >                 return XDP_ABORTED;
>> >> >> > }
>> >> >> >
>> >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
>> >> >> > these pointers need to be invalidated when clear_all_pkt_pointers is invoked,
>> >> >> > and it is also more meaningful to the user to see the return value as R0=pkt.
>> >> >> >
>> >> >> > This series is meant to collect feedback on the approach. The next version
>> >> >> > can include a bpf_skb_pointer and expose it as the bpf_packet_pointer helper
>> >> >> > for TC hooks as well, and explore not resetting the range to zero on
>> >> >> > r0 += rX, instead checking access like check_mem_region_access does
>> >> >> > (var_off + off < range), since there would be no data_end to compare
>> >> >> > against to obtain a new range.
>> >> >> >
>> >> >> > The common name and func_id are supposed to allow writing generic code using
>> >> >> > bpf_packet_pointer that works for both XDP and TC programs.
>> >> >> >
>> >> >> > Please see the individual patches for implementation details.
>> >> >> >
>> >> >>
>> >> >> Joanne is working on a "bpf_dynptr" framework that will support
>> >> >> exactly this feature, in addition to working with dynamically
>> >> >> allocated memory, working with memory of statically unknown size (but
>> >> >> safe and checked at runtime), etc. And all that within a generic
>> >> >> common feature implemented uniformly within the verifier. E.g., it
>> >> >> won't need any of the custom bits of logic added in patch #2 and #3.
>> >> >> So I'm thinking that instead of custom-implementing a partial case of
>> >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
>> >> >> and do it only once there?
>> >> >>
>> >> >
>> >> > Interesting stuff, looking forward to it.
>> >> >
>> >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
>> >> >> where an input constant is used to characterize some pointer returned
>> >> >> from the helper (e.g., the bpf_ringbuf_reserve() case), and we'll need
>> >> >> that for bpf_dynptr for exactly this "give me direct access of N
>> >> >> bytes, if possible" case. So improving/generalizing it now before
>> >> >> dynptr lands makes a lot of sense, outside of the bpf_packet_pointer()
>> >> >> feature itself.
>> >> >
>> >> > No worries, we can continue the discussion in patch 1, I'll split out the arg
>> >> > changes into a separate patch, and wait for dynptr to be posted before reworking
>> >> > this.
>> >>
>> >> This does raise the question of what we do in the meantime, though? Your
>> >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
>> >> making it, really has to go in before those hit a release and become
>> >> UAPI.
>> >>
>> >> One option would be to still make the change to those other helpers;
>> >> they'd become a bit slower, but if we have a solution for that coming,
>> >> that may be OK for a single release? WDYT?
>> >
>> > I must have missed important changes to bpf_xdp_{load,store}_bytes().
>> > Does anything change about their behavior? If there are some fixes
>> > specific to those helpers, we should make them as well, as a separate
>> > patch. My main objection is adding a bpf_packet_pointer() special case
>> > when we have a generic mechanism in the works that will cover this use
>> > case (among other use cases).
>>
>> Well, it's not a functional change per se, but Kartikeya's patch is
>> removing an optimisation from bpf_xdp_{load,store}_bytes() (i.e., the
>> use of bpf_xdp_pointer()) in favour of making it available directly
>> to BPF. So if we don't make that change before those helpers are
>> finalised, we will end up either introducing a performance regression
>> for code using those helpers, or being stuck with the bpf_xdp_pointer()
>> use inside them even though it makes more sense to move it out to BPF.
>>
>
> So IIUC, the case we're worried about is when a linear region is in the
> head or a single frag, and bpf_xdp_pointer can be used to do a direct
> memcpy for it. But in my testing there doesn't seem to be any difference:
> with or without the call, the time taken e.g. for bpf_xdp_load_bytes lies
> in the 30-40ns range. That makes sense, because for this case the code in
> bpf_xdp_pointer and bpf_xdp_copy_buf is almost the same, just that the
> latter has a conditional jump out of the loop based on len.
> bpf_xdp_copy_buf still only does a single memcpy, and the cost seems to
> be dominated by that.
>
> Otoh, removing it would improve the other scenario (when the region
> touches two or more frags), because we wouldn't spend time in
> bpf_xdp_pointer only for it to fail to find a linear region and return
> NULL. So that shouldn't be a regression either.

Yeah, that was basically what I was worried about; thanks for testing!
So this implies that the current use of the bpf_xdp_pointer() helper
function is pretty pointless, right? But at least it's an internal
detail so there's no hurry in fixing it...
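
Just to spell out what I mean by "the current use": IIRC the load helper
is doing roughly the following (paraphrasing from memory, so the details
may be a bit off):

        ptr = bpf_xdp_pointer(xdp, offset, len);
        if (IS_ERR(ptr))
                return PTR_ERR(ptr);

        if (ptr)
                memcpy(buf, ptr, len);  /* region is linear, copy directly */
        else
                bpf_xdp_copy_buf(xdp, offset, buf, len, false); /* walk frags */

        return 0;

I.e., if bpf_xdp_copy_buf() is just as fast for the linear case, the
bpf_xdp_pointer() call on that path isn't really buying us anything.
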
-Toke