Message-ID: <Y7b4Pj0ASpV7Z8TS@C02YVCJELVCG.dhcp.broadcom.net>
Date:   Thu, 5 Jan 2023 11:18:06 -0500
From:   Andy Gospodarek <andrew.gospodarek@...adcom.com>
To:     Tariq Toukan <ttoukan.linux@...il.com>
Cc:     Andy Gospodarek <andrew.gospodarek@...adcom.com>, ast@...nel.org,
        daniel@...earbox.net, davem@...emloft.net, kuba@...nel.org,
        hawk@...nel.org, john.fastabend@...il.com, andrii@...nel.org,
        kafai@...com, songliubraving@...com, yhs@...com,
        kpsingh@...nel.org, toke@...hat.com, lorenzo.bianconi@...hat.com,
        netdev@...r.kernel.org, bpf@...r.kernel.org,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Lorenzo Bianconi <lorenzo@...nel.org>, gal@...dia.com,
        Saeed Mahameed <saeedm@...dia.com>, tariqt@...dia.com
Subject: Re: [PATCH net-next v2] samples/bpf: fixup some tools to be able to
 support xdp multibuffer

On Tue, Jan 03, 2023 at 02:55:22PM +0200, Tariq Toukan wrote:
> 
> 
> On 21/06/2022 20:54, Andy Gospodarek wrote:
> > This changes the section name for the bpf program embedded in these
> > files to "xdp.frags" to allow the programs to be loaded on drivers that
> > are using an MTU greater than PAGE_SIZE.  Rather than directly accessing
> > the buffers, the packet data is now accessed via xdp helper functions to
> > provide an example for those who may need to write more complex
> > programs.
> > 
> > v2: remove new unnecessary variable
> > 
> 
> Hi,
> 
> I'm trying to understand if there are any assumptions/requirements on the
> length of the xdp_buf linear part when passed to XDP multi-buf programs?
> Can the linear part be empty, with all data residing in the fragments? Is it
> valid?

That's a great question.  The implementation in bnxt_en was based on my
understanding of the implementation in mvneta, where the linear area
contains approximately the first 4k of data minus the xdp headroom and
dma_offset.  This means that with a 9k MTU you have something that looks
like this:

skb->data	[~3.6k of packet data]
skb->frag[0]	[4k of packet data]
     frag[1]	[remainder of packet data]

At some point, I'd like to take the opportunity to test something like
this:

skb->data	[header only + space for header expansion]
skb->frag[0]	[first 4k of data]
     frag[1]	[second 4k of data]
     frag[2]	[remainder of packet data]

Though this will use a bit more memory, I think it will be much more
performant for data that is ultimately consumed rather than forwarded
by the host as the actual packet data will be aligned on page boundaries.
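
To make the two layouts above concrete, here is a rough userspace sketch
(not driver code) that splits a packet into a linear part plus page-sized
fragments.  The names split_packet, struct pkt_layout and MAX_FRAGS are
hypothetical, and the linear capacities used with it (3600 bytes for the
current layout, 128 bytes for a header-only linear area) are round numbers
picked for illustration only; real drivers will differ.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAGS 17	/* hypothetical cap, for illustration only */

struct pkt_layout {
	size_t linear_len;		/* bytes placed in the linear area */
	size_t nr_frags;		/* number of fragments used */
	size_t frag_len[MAX_FRAGS];	/* bytes placed in each fragment */
};

/*
 * Place up to linear_cap bytes in the linear area and the rest in
 * fragments of at most frag_size bytes each.
 * Returns 0 on success, -1 if more than MAX_FRAGS fragments are needed.
 */
static int split_packet(size_t pkt_len, size_t linear_cap, size_t frag_size,
			struct pkt_layout *out)
{
	size_t remaining;

	assert(frag_size > 0);

	out->linear_len = pkt_len < linear_cap ? pkt_len : linear_cap;
	out->nr_frags = 0;
	remaining = pkt_len - out->linear_len;

	while (remaining > 0) {
		size_t chunk = remaining < frag_size ? remaining : frag_size;

		if (out->nr_frags == MAX_FRAGS)
			return -1;
		out->frag_len[out->nr_frags++] = chunk;
		remaining -= chunk;
	}
	return 0;
}
```

Under those assumptions, a 9000-byte packet with a 3600-byte linear
capacity splits as 3600 + 4096 + 1304 (two frags), while a 128-byte
header-only linear area gives 128 + 4096 + 4096 + 680 (three frags), which
is the extra fragment referred to above.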

With the ability to have packets that are handled by an XDP program
span buffers, I would also like to test out whether or not it would be
worthwhile to have standard MTU packets also look like this:

skb->data	[header only + space for header expansion]
skb->frag[0]	[packet data]

I think the overall system performance would be better in the XDP_PASS
case, but until there is data to back this up, that's just speculation. 

> Per the proposed pattern below (calling bpf_xdp_load_bytes() to memcpy
> packet data into a local buffer), no such assumption is required, and an
> xdp_buf created by the driver with an empty linear part is valid.
> 
> However, in the _xdp_tx_iptunnel example program, it fails (returns
> XDP_DROP) in case the headers are not in the linear part.
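
The difference between the two access patterns can be modelled in plain
userspace C.  This is only a sketch of the idea, not the kernel helper:
struct buf_seg, model_load_bytes and model_direct_access_ok are
hypothetical names, segment 0 stands in for the linear area, and the
remaining segments stand in for the frags.  The copy-based model works no
matter where the bytes live, while the direct-pointer model only succeeds
if the headers fit in the linear area.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct buf_seg {
	const unsigned char *data;	/* may be NULL when len is 0 */
	size_t len;
};

/*
 * Model of a copy-based accessor: copy len bytes starting at offset from
 * a chain of segments into dst.  Returns 0 on success, -1 if the range
 * does not fit in the total length (nothing is copied in that case).
 */
static int model_load_bytes(const struct buf_seg *segs, size_t nr_segs,
			    size_t offset, void *dst, size_t len)
{
	unsigned char *out = dst;
	size_t total = 0;
	size_t i;

	for (i = 0; i < nr_segs; i++)
		total += segs[i].len;
	if (offset > total || len > total - offset)
		return -1;

	for (i = 0; i < nr_segs && len > 0; i++) {
		size_t avail, chunk;

		if (offset >= segs[i].len) {
			/* requested range starts after this segment */
			offset -= segs[i].len;
			continue;
		}
		avail = segs[i].len - offset;
		chunk = len < avail ? len : avail;
		memcpy(out, segs[i].data + offset, chunk);
		out += chunk;
		len -= chunk;
		offset = 0;
	}
	return 0;
}

/*
 * Model of direct pointer access: a header of hdr_len bytes is reachable
 * only if it lies entirely within the linear area.
 */
static int model_direct_access_ok(const struct buf_seg *linear, size_t hdr_len)
{
	return hdr_len <= linear->len;
}
```

With an empty linear segment, model_load_bytes still returns the header
bytes from the fragments, whereas model_direct_access_ok fails for any
non-zero header length, which mirrors the XDP_DROP behaviour described
for _xdp_tx_iptunnel.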
> 
> Regards,
> Tariq
> 
> > Signed-off-by: Andy Gospodarek <gospo@...adcom.com>
> > Acked-by: John Fastabend <john.fastabend@...il.com>
> > Acked-by: Lorenzo Bianconi <lorenzo@...nel.org>
> > ---
> >   samples/bpf/xdp1_kern.c            | 11 ++++++++---
> >   samples/bpf/xdp2_kern.c            | 11 ++++++++---
> >   samples/bpf/xdp_tx_iptunnel_kern.c |  2 +-
> >   3 files changed, 17 insertions(+), 7 deletions(-)
> > 
> > diff --git a/samples/bpf/xdp1_kern.c b/samples/bpf/xdp1_kern.c
> > index f0c5d95084de..0a5c704badd0 100644
> > --- a/samples/bpf/xdp1_kern.c
> > +++ b/samples/bpf/xdp1_kern.c
> > @@ -39,11 +39,13 @@ static int parse_ipv6(void *data, u64 nh_off, void *data_end)
> >   	return ip6h->nexthdr;
> >   }
> > -SEC("xdp1")
> > +#define XDPBUFSIZE	64
> > +SEC("xdp.frags")
> >   int xdp_prog1(struct xdp_md *ctx)
> >   {
> > -	void *data_end = (void *)(long)ctx->data_end;
> > -	void *data = (void *)(long)ctx->data;
> > +	__u8 pkt[XDPBUFSIZE] = {};
> > +	void *data_end = &pkt[XDPBUFSIZE-1];
> > +	void *data = pkt;
> >   	struct ethhdr *eth = data;
> >   	int rc = XDP_DROP;
> >   	long *value;
> > @@ -51,6 +53,9 @@ int xdp_prog1(struct xdp_md *ctx)
> >   	u64 nh_off;
> >   	u32 ipproto;
> > +	if (bpf_xdp_load_bytes(ctx, 0, pkt, sizeof(pkt)))
> > +		return rc;
> > +
> >   	nh_off = sizeof(*eth);
> >   	if (data + nh_off > data_end)
> >   		return rc;
> > diff --git a/samples/bpf/xdp2_kern.c b/samples/bpf/xdp2_kern.c
> > index d8a64ab077b0..3332ba6bb95f 100644
> > --- a/samples/bpf/xdp2_kern.c
> > +++ b/samples/bpf/xdp2_kern.c
> > @@ -55,11 +55,13 @@ static int parse_ipv6(void *data, u64 nh_off, void *data_end)
> >   	return ip6h->nexthdr;
> >   }
> > -SEC("xdp1")
> > +#define XDPBUFSIZE	64
> > +SEC("xdp.frags")
> >   int xdp_prog1(struct xdp_md *ctx)
> >   {
> > -	void *data_end = (void *)(long)ctx->data_end;
> > -	void *data = (void *)(long)ctx->data;
> > +	__u8 pkt[XDPBUFSIZE] = {};
> > +	void *data_end = &pkt[XDPBUFSIZE-1];
> > +	void *data = pkt;
> >   	struct ethhdr *eth = data;
> >   	int rc = XDP_DROP;
> >   	long *value;
> > @@ -67,6 +69,9 @@ int xdp_prog1(struct xdp_md *ctx)
> >   	u64 nh_off;
> >   	u32 ipproto;
> > +	if (bpf_xdp_load_bytes(ctx, 0, pkt, sizeof(pkt)))
> > +		return rc;
> > +
> >   	nh_off = sizeof(*eth);
> >   	if (data + nh_off > data_end)
> >   		return rc;
> > diff --git a/samples/bpf/xdp_tx_iptunnel_kern.c b/samples/bpf/xdp_tx_iptunnel_kern.c
> > index 575d57e4b8d6..0e2bca3a3fff 100644
> > --- a/samples/bpf/xdp_tx_iptunnel_kern.c
> > +++ b/samples/bpf/xdp_tx_iptunnel_kern.c
> > @@ -212,7 +212,7 @@ static __always_inline int handle_ipv6(struct xdp_md *xdp)
> >   	return XDP_TX;
> >   }
> > -SEC("xdp_tx_iptunnel")
> > +SEC("xdp.frags")
> >   int _xdp_tx_iptunnel(struct xdp_md *xdp)
> >   {
> >   	void *data_end = (void *)(long)xdp->data_end;
