Message-ID: <ZWnk8BVkkbdiEthm@lzaremba-mobl.ger.corp.intel.com>
Date: Fri, 1 Dec 2023 14:51:44 +0100
From: Larysa Zaremba <larysa.zaremba@...el.com>
To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
CC: <bpf@...r.kernel.org>, <ast@...nel.org>, <daniel@...earbox.net>,
	<andrii@...nel.org>, <martin.lau@...ux.dev>, <song@...nel.org>, <yhs@...com>,
	<john.fastabend@...il.com>, <kpsingh@...nel.org>, <sdf@...gle.com>,
	<haoluo@...gle.com>, <jolsa@...nel.org>, David Ahern <dsahern@...il.com>,
	Jakub Kicinski <kuba@...nel.org>, Willem de Bruijn <willemb@...gle.com>,
	Jesper Dangaard Brouer <hawk@...nel.org>, Anatoly Burakov
	<anatoly.burakov@...el.com>, Alexander Lobakin <alexandr.lobakin@...el.com>,
	Magnus Karlsson <magnus.karlsson@...il.com>, Maryam Tahhan
	<mtahhan@...hat.com>, <xdp-hints@...-project.net>, <netdev@...r.kernel.org>,
	Willem de Bruijn <willemdebruijn.kernel@...il.com>, Alexei Starovoitov
	<alexei.starovoitov@...il.com>, Tariq Toukan <tariqt@...lanox.com>, "Saeed
 Mahameed" <saeedm@...lanox.com>
Subject: Re: [PATCH bpf-next v7 04/18] ice: Introduce ice_xdp_buff

On Thu, Nov 16, 2023 at 03:55:19AM +0100, Larysa Zaremba wrote:
> On Thu, Nov 16, 2023 at 01:52:24PM +0100, Maciej Fijalkowski wrote:
> > On Wed, Nov 15, 2023 at 06:52:46PM +0100, Larysa Zaremba wrote:
> > > In order to use XDP hints via kfuncs we need to put
> > > RX descriptor and miscellaneous data next to xdp_buff.
> > > Same as in hints implementations in other drivers, we achieve
> > > this through putting xdp_buff into a child structure.
> > > 
> > > Currently, xdp_buff is stored in the ring structure,
> > > so replace it with union that includes child structure.
> > > This way enough memory is available while existing XDP code
> > > remains isolated from hints.
> > > 
> > > Minimum size of the new child structure (ice_xdp_buff) is exactly
> > > 64 bytes (single cache line). To place it at the start of a cache line,
> > > move 'next' field from CL1 to CL4, as it isn't used often. This still
> > > leaves 192 bits available in CL3 for packet context extensions.
> > > 
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@...el.com>
> > > ---
> > >  drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++++--
> > >  drivers/net/ethernet/intel/ice/ice_txrx.h     | 18 +++++++++++++++---
> > >  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 ++++++++++
> > >  3 files changed, 30 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > index 40f2f6dabb81..4e6546d9cf85 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
> > >   * @xdp_prog: XDP program to run
> > >   * @xdp_ring: ring to be used for XDP_TX action
> > >   * @rx_buf: Rx buffer to store the XDP action
> > > + * @eop_desc: Last descriptor in packet to read metadata from
> > >   *
> > >   * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
> > >   */
> > >  static void
> > >  ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > >  	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
> > > -	    struct ice_rx_buf *rx_buf)
> > > +	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
> > >  {
> > >  	unsigned int ret = ICE_XDP_PASS;
> > >  	u32 act;
> > > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > >  	if (!xdp_prog)
> > >  		goto exit;
> > >  
> > > +	ice_xdp_meta_set_desc(xdp, eop_desc);
> > > +
> > >  	act = bpf_prog_run_xdp(xdp_prog, xdp);
> > >  	switch (act) {
> > >  	case XDP_PASS:
> > > @@ -1240,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
> > >  		if (ice_is_non_eop(rx_ring, rx_desc))
> > >  			continue;
> > >  
> > > -		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf);
> > > +		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
> > >  		if (rx_buf->act == ICE_XDP_PASS)
> > >  			goto construct_skb;
> > >  		total_rx_bytes += xdp_get_buff_len(xdp);
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
> > > index 166413fc33f4..9efb42f99415 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.h
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
> > > @@ -257,6 +257,14 @@ enum ice_rx_dtype {
> > >  	ICE_RX_DTYPE_SPLIT_ALWAYS	= 2,
> > >  };
> > >  
> > > +struct ice_xdp_buff {
> > > +	struct xdp_buff xdp_buff;
> > > +	const union ice_32b_rx_flex_desc *eop_desc;
> > > +};
> > > +
> > > +/* Required for compatibility with xdp_buffs from xsk_pool */
> > > +static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0);
> > 
> > That should go to xsk core as a macro and should be used by ZC drivers
> > that support hints. Useful stuff similar to XSK_CHECK_PRIV_TYPE() but
> > check is from the other end.
> 
> Seems like there will be a v8 anyway, so I might as well do this :)
>

After going back and forth on this for a few days, I don't think it is a good
idea to include such a change in this series.

I do have to change the comment, though. The assert is needed mostly because my
code converts ice_xdp_buff <-> xdp_buff on multiple occasions, and flooding the
code with container_of() is not a solution.

Currently, XSK_CHECK_PRIV_TYPE() performs the most important task just fine:
ensuring that nothing outside cb[] is overwritten. An offset check macro could be
added in a separate patch to the xdp header (instead of xsk_buff_pool.h), since ice
isn't the only driver relying on xdp_buff inheritance. This will happen only if
I (or someone else) can make such a generic macro not look ugly, though.
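
To illustrate the two kinds of checks discussed above, here is a rough userspace
sketch, not a proposal for the actual header. Everything in it is an assumption
made for illustration: the structs are simplified stand-ins for the kernel ones,
and XDP_CHECK_BASE_OFFSET(), XDP_CHECK_FITS_CB(), struct xsk_buff_mock,
struct drv_xdp_buff and drv_from_xdp() are hypothetical names. It only shows why
an offset-0 guarantee lets a driver cast between the base and child structures
directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct xdp_buff (assumption). */
struct xdp_buff {
	void *data;
	void *data_end;
};

/* Simplified stand-in for an xsk buffer: base struct followed by a
 * driver-private cb[] scratch area (assumption, sizes are illustrative).
 */
struct xsk_buff_mock {
	struct xdp_buff xdp;
	uint8_t cb[24];
};

/* Hypothetical driver child structure, shaped like ice_xdp_buff. */
struct drv_xdp_buff {
	struct xdp_buff xdp_buff;
	const void *eop_desc;
};

/* Hypothetical offset check: the base must be the first member, so that
 * child <-> base conversion is a plain pointer cast.
 */
#define XDP_CHECK_BASE_OFFSET(type, member)			\
	static_assert(offsetof(type, member) == 0,		\
		      #type " must start with its xdp_buff")

/* Hypothetical size check: the child must not extend past the end of
 * cb[], i.e. the check "from the other end".
 */
#define XDP_CHECK_FITS_CB(type)					\
	static_assert(sizeof(type) <=				\
		      offsetof(struct xsk_buff_mock, cb) +	\
		      sizeof(((struct xsk_buff_mock *)0)->cb),	\
		      #type " must fit into xdp_buff + cb[]")

XDP_CHECK_BASE_OFFSET(struct drv_xdp_buff, xdp_buff);
XDP_CHECK_FITS_CB(struct drv_xdp_buff);

/* Valid only because of the offset check above. */
static inline struct drv_xdp_buff *drv_from_xdp(struct xdp_buff *xdp)
{
	return (struct drv_xdp_buff *)xdp;
}

/* Returns 1 if a write through the converted pointer lands in the child. */
static inline int drv_roundtrip_ok(void)
{
	struct drv_xdp_buff ext = { 0 };
	int desc = 0;
	struct xdp_buff *base = &ext.xdp_buff;

	drv_from_xdp(base)->eop_desc = &desc;
	return ext.eop_desc == &desc && (void *)base == (void *)&ext;
}
```

If the xdp_buff member were moved away from offset 0, the first assert would
fail at build time instead of the cast silently pointing at the wrong memory.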

> > 
> > > +
> > >  /* indices into GLINT_ITR registers */
> > >  #define ICE_RX_ITR	ICE_IDX_ITR0
> > >  #define ICE_TX_ITR	ICE_IDX_ITR1
> > > @@ -298,7 +306,6 @@ enum ice_dynamic_itr {
> > >  /* descriptor ring, associated with a VSI */
> > >  struct ice_rx_ring {
> > >  	/* CL1 - 1st cacheline starts here */
> > > -	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
> > >  	void *desc;			/* Descriptor ring memory */
> > >  	struct device *dev;		/* Used for DMA mapping */
> > >  	struct net_device *netdev;	/* netdev ring maps to */
> > > @@ -310,12 +317,16 @@ struct ice_rx_ring {
> > >  	u16 count;			/* Number of descriptors */
> > >  	u16 reg_idx;			/* HW register index of the ring */
> > >  	u16 next_to_alloc;
> > > -	/* CL2 - 2nd cacheline starts here */
> > > +
> > >  	union {
> > >  		struct ice_rx_buf *rx_buf;
> > >  		struct xdp_buff **xdp_buf;
> > >  	};
> > > -	struct xdp_buff xdp;
> > > +	/* CL2 - 2nd cacheline starts here */
> > > +	union {
> > > +		struct ice_xdp_buff xdp_ext;
> > > +		struct xdp_buff xdp;
> > > +	};
> > >  	/* CL3 - 3rd cacheline starts here */
> > >  	struct bpf_prog *xdp_prog;
> > >  	u16 rx_offset;
> > > @@ -332,6 +343,7 @@ struct ice_rx_ring {
> > >  	/* CL4 - 4th cacheline starts here */
> > >  	struct ice_channel *ch;
> > >  	struct ice_tx_ring *xdp_ring;
> > > +	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
> > >  	struct xsk_buff_pool *xsk_pool;
> > >  	dma_addr_t dma;			/* physical address of ring */
> > >  	u64 cached_phctime;
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> > > index e1d49e1235b3..81b8856d8e13 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
> > > @@ -151,4 +151,14 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
> > >  		       struct sk_buff *skb);
> > >  void
> > >  ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
> > > +
> > > +static inline void
> > > +ice_xdp_meta_set_desc(struct xdp_buff *xdp,
> > > +		      union ice_32b_rx_flex_desc *eop_desc)
> > > +{
> > > +	struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff,
> > > +						    xdp_buff);
> > > +
> > > +	xdp_ext->eop_desc = eop_desc;
> > > +}
> > >  #endif /* !_ICE_TXRX_LIB_H_ */
> > > -- 
> > > 2.41.0
> > > 
> 
