netdev - Re: [RFC bpf-next v1 3/7] bpf: Support pulling non-linear xdp data


Message-ID: <aKzVsZ0D53rhOhQe@mini-arch>
Date: Mon, 25 Aug 2025 14:29:21 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: Amery Hung <ameryhung@...il.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
	alexei.starovoitov@...il.com, andrii@...nel.org,
	daniel@...earbox.net, kuba@...nel.org, martin.lau@...nel.org,
	mohsin.bashr@...il.com, saeedm@...dia.com, tariqt@...dia.com,
	mbloch@...dia.com, maciej.fijalkowski@...el.com,
	kernel-team@...a.com
Subject: Re: [RFC bpf-next v1 3/7] bpf: Support pulling non-linear xdp data

On 08/25, Amery Hung wrote:
> Add kfunc, bpf_xdp_pull_data(), to support pulling data from xdp
> fragments. Similar to bpf_skb_pull_data(), bpf_xdp_pull_data() makes
> the first len bytes of data directly readable and writable in bpf
> programs. If the "len" argument is larger than the linear data size,
> data in fragments will be copied to the linear region when there
> is enough room between xdp->data_end and xdp_data_hard_end(xdp),
> which is subject to driver implementation.
> 
> A use case of the kfunc is to decapsulate headers residing in xdp
> fragments. It is possible for a NIC driver to place headers in xdp
> fragments. To keep using direct packet access for parsing and
> decapsulating headers, users can pull headers into the linear data
> area by calling bpf_xdp_pull_data() and then pop the header with
> bpf_xdp_adjust_head().
> 
> An unused argument, flags is reserved for future extension (e.g.,
> tossing the data instead of copying it to the linear data area).
> 
> Signed-off-by: Amery Hung <ameryhung@...il.com>
> ---
>  net/core/filter.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index f0ee5aec7977..82d953e077ac 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -12211,6 +12211,57 @@ __bpf_kfunc int bpf_sock_ops_enable_tx_tstamp(struct bpf_sock_ops_kern *skops,
>  	return 0;
>  }
>  
> +__bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len, u64 flags)
> +{
> +	struct xdp_buff *xdp = (struct xdp_buff *)x;
> +	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> +	void *data_end, *data_hard_end = xdp_data_hard_end(xdp);
> +	int i, delta, buff_len, n_frags_free = 0, len_free = 0;
> +
> +	buff_len = xdp_get_buff_len(xdp);
> +
> +	if (unlikely(len > buff_len))
> +		return -EINVAL;
> +
> +	if (!len)
> +		len = xdp_get_buff_len(xdp);

Why not return -EINVAL here for len=0?
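
If len=0 were rejected, the check could fold into the existing bounds
test. A minimal userspace sketch of that variant (untested against the
kernel; check_pull_len() and its parameters are hypothetical, with
buff_len standing in for xdp_get_buff_len(xdp)):

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the length validation, not kernel code.
 * buff_len stands in for the total packet length (linear part plus frags).
 */
static int check_pull_len(uint32_t len, uint32_t buff_len)
{
	/* Reject len == 0 instead of treating it as "pull everything". */
	if (len == 0 || len > buff_len)
		return -EINVAL;

	return 0;
}
```

With this shape a caller that wants the whole packet has to pass the
full length explicitly, which keeps the meaning of len unambiguous.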

> +
> +	data_end = xdp->data + len;
> +	delta = data_end - xdp->data_end;
> +
> +	if (delta <= 0)
> +		return 0;
> +
> +	if (unlikely(data_end > data_hard_end))
> +		return -EINVAL;
> +
> +	for (i = 0; i < sinfo->nr_frags && delta; i++) {
> +		skb_frag_t *frag = &sinfo->frags[i];
> +		u32 shrink = min_t(u32, delta, skb_frag_size(frag));
> +
> +		memcpy(xdp->data_end + len_free, skb_frag_address(frag), shrink);

skb_frag_address() can return NULL for unreadable frags, so the source
pointer needs a check before this memcpy().
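
Something along these lines, as a userspace model only: struct
model_frag and model_pull() are hypothetical stand-ins (addr == NULL
models an unreadable frag), and -EFAULT is an assumed error code, not
something the patch or this thread settles on. The sketch validates all
needed frags before copying so that a failure leaves the linear area
untouched:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for skb_frag_t; addr == NULL models an unreadable frag. */
struct model_frag {
	const uint8_t *addr;
	uint32_t size;
};

/*
 * Copy the first delta bytes held in frags[] to dst.
 * Returns 0 on success, -EFAULT (assumed) if a needed frag is unreadable,
 * -EINVAL if the frags hold fewer than delta bytes.
 */
static int model_pull(uint8_t *dst, uint32_t delta,
		      const struct model_frag *frags, int nr_frags)
{
	uint32_t left = delta, copied = 0;
	int i;

	/* Pass 1: check every frag we would touch before copying anything. */
	for (i = 0; i < nr_frags && left; i++) {
		uint32_t n = frags[i].size < left ? frags[i].size : left;

		if (!frags[i].addr)
			return -EFAULT;
		left -= n;
	}
	if (left)
		return -EINVAL;

	/* Pass 2: all sources are readable, do the copy. */
	left = delta;
	for (i = 0; i < nr_frags && left; i++) {
		uint32_t n = frags[i].size < left ? frags[i].size : left;

		memcpy(dst + copied, frags[i].addr, n);
		copied += n;
		left -= n;
	}

	return 0;
}
```

Whether the kernel version should fail the whole call or stop at the
first unreadable frag is a design choice this sketch does not decide;
it only shows the all-or-nothing variant.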