Message-ID: <20220109022448.bxgatdsx3obvipbu@ast-mbp.dhcp.thefacebook.com>
Date: Sat, 8 Jan 2022 18:24:48 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Network Development <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next v7 1/3] bpf: Add "live packet" mode for XDP in
bpf_prog_run()
On Sat, Jan 08, 2022 at 09:19:41PM +0100, Toke Høiland-Jørgensen wrote:
>
> Sure, totally fine with documenting it. Just seems to me the most
> obvious place to put this is in a new
> Documentation/bpf/prog_test_run.rst file with a short introduction about
> the general BPF_PROG_RUN mechanism, and then a subsection dedicated to
> this facility.
sgtm
> > I guess it's ok-ish to get stuck with 128.
> > It will be uapi that we cannot change though.
> > Are you comfortable with that?
>
> UAPI in what sense? I'm thinking of documenting it like:
>
> "The packet data being supplied as data_in to BPF_PROG_RUN will be used
> for the initial run of the XDP program. However, when running the
> program multiple times (with repeat > 1), only the packet *bounds*
> (i.e., the data, data_end and data_meta pointers) will be reset on each
> invocation; the packet data itself won't be rewritten. The pages
> backing the packets are recycled, but the order depends on the path the
> packet takes through the kernel, making it hard to predict when a
> particular modified page makes it back to the XDP program. In practice,
> this means that if the XDP program modifies the packet payload before
> sending out the packet, it has to be prepared to deal with subsequent
> invocations seeing either the initial data or the already-modified
> packet, in arbitrary order."
>
> I don't think this makes any promises about any particular size of the
> page pool, so how does it constitute UAPI?
Could you explain the out-of-order scenario again?
It's possible only if xdp_redirect is done into different netdevs.
Then they can xmit at different times and cycle pages back into
the loop in different order. But TX or REDIRECT into the same netdev
will keep the pages in the same order. So the program can rely on that.
> >
> > reinit doesn't feel necessary.
> > How would one use this interface to send N different packets?
> > The API provides an interface for only one.
>
> By having the XDP program react appropriately. E.g., here is the XDP
> program used by the trafficgen tool to cycle through UDP ports when
> sending out the packets - it just reads the current value and updates
> based on that, so it doesn't matter if it sees the initial page or one
> it already modified:
Sure. I think there is untapped potential here.
With this live packet prog_run anyone can buy a 10G or 100G NIC-equipped
server and for free transform it into a $300k+ IXIA-beating machine.
It could be a game changer. pktgen doesn't come close.
I'm thinking about generating and consuming test TCP traffic.
TCP blaster would xmit 1M TCP connections through this live prog_run
into eth0 and consume the traffic returning from "server under test"
via a different XDP program attached to eth0.
The prog_run's XDP prog would need to send SYN, increment the sequence number,
and keep sane data in the packets. It could be an HTTP request, for example.
To achieve this IXIA-beating setup the TCP blaster would need a full
understanding of what the page pool is doing with the packets.
Just saying "in arbitrary order" is a non-starter. It diminishes
this live prog_run into a pktgen equivalent, which is still useful,
but a lot of potential is lost.
> Another question seeing as the merge window is imminent: How do you feel
> about merging this before the merge window? I can resubmit before it
> opens with the updated selftest and documentation, and we can deal with
> any tweaks during the -rcs; or would you rather postpone the whole
> thing until the next cycle?
It's already too late for this merge window, but bpf-next is always open.
Just like it was open for the last year. So please resubmit as soon as
the tests are green and this discussion is over.