[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aBLPtszlDe74yTlk@mini-arch>
Date: Wed, 30 Apr 2025 18:34:46 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: Jon Kohler <jon@...anix.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Jason Wang <jasowang@...hat.com>,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Simon Horman <horms@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>
Subject: Re: [PATCH net-next v2] xdp: Add helpers for head length, headroom,
and metadata length
On 05/01, Jon Kohler wrote:
>
>
> > On Apr 30, 2025, at 5:09 PM, Stanislav Fomichev <stfomichev@...il.com> wrote:
> >
> > !-------------------------------------------------------------------|
> > CAUTION: External Email
> >
> > |-------------------------------------------------------------------!
> >
> > On 04/30, Jon Kohler wrote:
> >> Introduce new XDP helpers:
> >> - xdp_headlen: Similar to skb_headlen
> >> - xdp_headroom: Similar to skb_headroom
> >> - xdp_metadata_len: Similar to skb_metadata_len
> >>
> >> Integrate these helpers into tap, tun, and XDP implementation to start.
> >>
> >> No functional changes introduced.
> >>
> >> Signed-off-by: Jon Kohler <jon@...anix.com>
> >> ---
> >> v1->v2: Integrate feedback from Willem
> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_project_netdevbpf_patch_20250430182921.1704021-2D1-2Djon-40nutanix.com_&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=NGPRGGo37mQiSXgHKm5rCQ&m=9pdxzQszX_M0K3gEPeYOyMZZYSkRR8IMvxslS8320Eoctk58y-ELCdZ5iaryF2GH&s=J-ILB7E9VQ_plo0hyjEtzGzjy6G0_o4MMMmmE_z8vvc&e=
> >>
> >> drivers/net/tap.c | 6 +++---
> >> drivers/net/tun.c | 12 +++++------
> >> include/net/xdp.h | 54 +++++++++++++++++++++++++++++++++++++++++++----
> >> net/core/xdp.c | 12 +++++------
> >> 4 files changed, 65 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/drivers/net/tap.c b/drivers/net/tap.c
> >> index d4ece538f1b2..a62fbca4b08f 100644
> >> --- a/drivers/net/tap.c
> >> +++ b/drivers/net/tap.c
> >> @@ -1048,7 +1048,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
> >> struct sk_buff *skb;
> >> int err, depth;
> >>
> >> - if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
> >> + if (unlikely(xdp_headlen(xdp) < ETH_HLEN)) {
> >> err = -EINVAL;
> >> goto err;
> >> }
> >> @@ -1062,8 +1062,8 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
> >> goto err;
> >> }
> >>
> >> - skb_reserve(skb, xdp->data - xdp->data_hard_start);
> >> - skb_put(skb, xdp->data_end - xdp->data);
> >> + skb_reserve(skb, xdp_headroom(xdp));
> >> + skb_put(skb, xdp_headlen(xdp));
> >>
> >> skb_set_network_header(skb, ETH_HLEN);
> >> skb_reset_mac_header(skb);
> >> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> >> index 7babd1e9a378..4c47eed71986 100644
> >> --- a/drivers/net/tun.c
> >> +++ b/drivers/net/tun.c
> >> @@ -1567,7 +1567,7 @@ static int tun_xdp_act(struct tun_struct *tun, struct bpf_prog *xdp_prog,
> >> dev_core_stats_rx_dropped_inc(tun->dev);
> >> return err;
> >> }
> >> - dev_sw_netstats_rx_add(tun->dev, xdp->data_end - xdp->data);
> >> + dev_sw_netstats_rx_add(tun->dev, xdp_headlen(xdp));
> >> break;
> >> case XDP_TX:
> >> err = tun_xdp_tx(tun->dev, xdp);
> >> @@ -1575,7 +1575,7 @@ static int tun_xdp_act(struct tun_struct *tun, struct bpf_prog *xdp_prog,
> >> dev_core_stats_rx_dropped_inc(tun->dev);
> >> return err;
> >> }
> >> - dev_sw_netstats_rx_add(tun->dev, xdp->data_end - xdp->data);
> >> + dev_sw_netstats_rx_add(tun->dev, xdp_headlen(xdp));
> >> break;
> >> case XDP_PASS:
> >> break;
> >> @@ -2355,7 +2355,7 @@ static int tun_xdp_one(struct tun_struct *tun,
> >> struct xdp_buff *xdp, int *flush,
> >> struct tun_page *tpage)
> >> {
> >> - unsigned int datasize = xdp->data_end - xdp->data;
> >> + unsigned int datasize = xdp_headlen(xdp);
> >> struct tun_xdp_hdr *hdr = xdp->data_hard_start;
> >> struct virtio_net_hdr *gso = &hdr->gso;
> >> struct bpf_prog *xdp_prog;
> >> @@ -2415,14 +2415,14 @@ static int tun_xdp_one(struct tun_struct *tun,
> >> goto out;
> >> }
> >>
> >> - skb_reserve(skb, xdp->data - xdp->data_hard_start);
> >> - skb_put(skb, xdp->data_end - xdp->data);
> >> + skb_reserve(skb, xdp_headroom(xdp));
> >> + skb_put(skb, xdp_headlen(xdp));
> >>
> >> /* The externally provided xdp_buff may have no metadata support, which
> >> * is marked by xdp->data_meta being xdp->data + 1. This will lead to a
> >> * metasize of -1 and is the reason why the condition checks for > 0.
> >> */
> >> - metasize = xdp->data - xdp->data_meta;
> >> + metasize = xdp_metadata_len(xdp);
> >> if (metasize > 0)
> >> skb_metadata_set(skb, metasize);
> >>
> >> diff --git a/include/net/xdp.h b/include/net/xdp.h
> >> index 48efacbaa35d..044345b18305 100644
> >> --- a/include/net/xdp.h
> >> +++ b/include/net/xdp.h
> >> @@ -151,10 +151,56 @@ xdp_get_shared_info_from_buff(const struct xdp_buff *xdp)
> >> return (struct skb_shared_info *)xdp_data_hard_end(xdp);
> >> }
> >>
> >> +/**
> >> + * xdp_headlen - Calculate the length of the data in an XDP buffer
> >> + * @xdp: Pointer to the XDP buffer structure
> >> + *
> >> + * Compute the length of the data contained in the XDP buffer. Does not
> >> + * include frags, use xdp_get_buff_len() for that instead.
> >> + *
> >> + * Analogous to skb_headlen().
> >> + *
> >> + * Return: The length of the data in the XDP buffer in bytes.
> >> + */
> >> +static inline unsigned int xdp_headlen(const struct xdp_buff *xdp)
> >> +{
> >> + return xdp->data_end - xdp->data;
> >> +}
> >> +
> >> +/**
> >> + * xdp_headroom - Calculate the headroom available in an XDP buffer
> >> + * @xdp: Pointer to the XDP buffer structure
> >> + *
> >> + * Compute the headroom in an XDP buffer.
> >> + *
> >> + * Analogous to the skb_headroom().
> >> + *
> >> + * Return: The size of the headroom in bytes.
> >> + */
> >> +static inline unsigned int xdp_headroom(const struct xdp_buff *xdp)
> >> +{
> >> + return xdp->data - xdp->data_hard_start;
> >> +}
> >> +
> >> +/**
> >> + * xdp_metadata_len - Calculate the length of metadata in an XDP buffer
> >> + * @xdp: Pointer to the XDP buffer structure
> >> + *
> >> + * Compute the length of the metadata region in an XDP buffer.
> >> + *
> >> + * Analogous to skb_metadata_len().
> >> + *
> >> + * Return: The length of the metadata in bytes.
> >> + */
> >> +static inline unsigned int xdp_metadata_len(const struct xdp_buff *xdp)
> >
> > I believe this has to return int, not unsigned int. There are places
> > where we do data_meta = data + 1, and the callers check whether
> > the result is signed or sunsigned.
>
> The existing SKB APIs are the exact same return type (I copied them 1:1),
> but I have a feeling that we’re never intending these values to either A) be
> negative and/or B) wrap in strange ways?
>
> >
> >> +{
> >> + return xdp->data - xdp->data_meta;
> >> +}
> >> +
> >> static __always_inline unsigned int
> >> xdp_get_buff_len(const struct xdp_buff *xdp)
> >> {
> >> - unsigned int len = xdp->data_end - xdp->data;
> >> + unsigned int len = xdp_headlen(xdp);
> >> const struct skb_shared_info *sinfo;
> >>
> >> if (likely(!xdp_buff_has_frags(xdp)))
> >> @@ -364,8 +410,8 @@ int xdp_update_frame_from_buff(const struct xdp_buff *xdp,
> >> int metasize, headroom;
>
> Said another way, perhaps this should be unsigned?
>
> >>
> >> /* Assure headroom is available for storing info */
> >> - headroom = xdp->data - xdp->data_hard_start;
> >> - metasize = xdp->data - xdp->data_meta;
> >> + headroom = xdp_headroom(xdp);
> >> + metasize = xdp_metadata_len(xdp);
> >> metasize = metasize > 0 ? metasize : 0;
> >
> > ^^ like here
>
> Look across the tree, seems like more are unsigned than signed
The ones that are unsigned are either calling xdp_data_meta_unsupported
explicitly (and it does > to check for this condition, not signed math)
or are running in the drivers that are guaranteed to have metadata
support (and, hence, always have data_meta <= data).
> These ones use unsigned:
> xdp_convert_zc_to_xdp_frame
This uses xdp_data_meta_unsupported
> veth_xdp_rcv_skb
> xsk_construct_skb
> bnxt_copy_xdp
> i40e_build_skb
> i40e_construct_skb_zc
> ice_build_skb (this is u8)
> ice_construct_skb_zc
> igb_build_skb
> igb_construct_skb_zc
> igc_build_skb
> igc_construct_skb
> igc_construct_skb_zc
> ixgbe_build_skb
> ixgbe_construct_skb_zc
> ixgbevf_build_skb
> mvneta_swbm_build_skb
> mlx5e_xsk_construct_skb
> mana_build_skb
> stmmac_construct_skb_zc
> bpf_prog_run_generic_xdp
These run in the drivers that support metadata (data_meta <= data)
> xdp_get_metalen
This uses xdp_data_meta_unsupported
> These ones are regular int:
> xdp_build_skb_from_buff
> xdp_build_skb_from_zc
> xdp_update_frame_from_buff
> tun_xdp_one
> build_skb_from_xdp_buff
These can be called from the drivers that support and don't support
the metadata, so have to (correctly) use int.
> Perhaps a separate patch to convert the regulars to unsigned,
> thoughts?
Take a look at xdp_set_data_meta_invalid and xdp_data_meta_unsupported.
There are cases where xdp->data - xdp->data_meta is -1 (and the callers
check for this condition), we can't use unsigned unconditionally
(unless we use xdp_data_meta_unsupported).
Powered by blists - more mailing lists