Message-ID: <aQzZfXz9qBjr5vtB@horms.kernel.org>
Date: Thu, 6 Nov 2025 17:23:09 +0000
From: Simon Horman <horms@...nel.org>
To: Jacob Keller <jacob.e.keller@...el.com>
Cc: Aleksandr Loktionov <aleksandr.loktionov@...el.com>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>,
intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org
Subject: Re: [PATCH iwl-next v2 2/9] ice: use cacheline groups for
ice_rx_ring structure
On Wed, Nov 05, 2025 at 01:06:34PM -0800, Jacob Keller wrote:
> The ice ring structure was reorganized back in commit 65124bbf980c ("ice:
> Reorganize tx_buf and ring structs"), and later split into a separate
> ice_rx_ring structure by commit e72bba21355d ("ice: split ice_ring onto
> Tx/Rx separate structs").
>
> The ice_rx_ring structure has comments left over from this prior
> reorganization indicating which fields belong to which cachelines.
> Unfortunately, these comments are not all accurate. The intended layout is
> for x86_64 systems with a 64-byte cache line.
>
> * Cacheline 1 spans from the start of the struct to the end of the rx_fqes
> and xdp_buf union. The comments correctly match this.
>
> * Cacheline 2 spans from hdr_fqes to the end of hdr_truesize, but the
> comment indicates it should end at the xdp and xsk union.
>
> * Cacheline 3 spans from the truesize field to the xsk_pool, but the
> comment wants this to be from the pkt_ctx down to the rcu head field.
>
> * Cacheline 4 spans from the rx_hdr_len down to the flags field, but the
> comment indicates that it starts back at the ice_channel structure
> pointer.
>
> * Cacheline 5 is indicated to cover the xdp_rxq. Because this field is
> aligned to 64 bytes, this is actually true. However, there is a large
> 45-byte gap at the end of cacheline 4.
>
> The use of comments to indicate cachelines is poor design. In practice,
> comments like this quickly become outdated as developers add or remove
> fields, or as other sub-structures change layout and size unexpectedly.
>
> The ice_rx_ring structure *is* 5 cachelines (320 bytes), but ends up having
> quite a lot of empty space at the end just before xdp_rxq.
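
(Not part of the change, just a note for readers: the layout described
above is easy to double-check with pahole, which prints per-member
offsets, cacheline boundaries and holes. This assumes the module was
built with debug info; the path below is simply where ice.ko lands in
an in-tree build.)

  $ pahole -C ice_rx_ring drivers/net/ethernet/intel/ice/ice.ko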
>
> Replace the comments with __cacheline_group_(begin|end)_aligned()
> annotations. These macros enforce alignment to the start of cachelines, and
> enforce padding between groups, thus guaranteeing that a following group
> cannot be part of the same cacheline.
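
(Again not part of the patch: a minimal sketch of how these annotations
are used, with a made-up struct "foo" and made-up fields; the real usage
is in the hunk further down. Note that the check helper has to be called
from somewhere so the compiler actually evaluates it, which is presumably
why the patch also touches ice_main.c.)

#include <linux/cache.h>
#include <linux/types.h>

struct foo {
	__cacheline_group_begin_aligned(hot);
	u32 head;		/* fields written in the hot path */
	u32 tail;
	__cacheline_group_end_aligned(hot);

	__cacheline_group_begin_aligned(cold);
	u64 byte_count;		/* rarely touched fields */
	__cacheline_group_end_aligned(cold);
};

/* Build-time check that each group fits in a 64-byte cacheline. */
static inline void foo_struct_check(void)
{
	CACHELINE_ASSERT_GROUP_SIZE(struct foo, hot, 64);
	CACHELINE_ASSERT_GROUP_SIZE(struct foo, cold, 64);
}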
>
> Doing this changes the layout by effectively spreading the padding at the
> tail of cacheline 4 across the groups, ensuring that the relevant fields
> are kept in separate cachelines on x86_64 systems. For systems with the
> expected cache line size of 64 bytes, the structure size does not change.
> For systems with a larger SMP_CACHE_BYTES, the structure *will* increase
> in size a fair bit; however, we'll now guarantee that each group is
> in a separate cacheline. This has an advantage that updates to fields in
> one group won't trigger cacheline eviction of the other groups. This comes
> at the expense of extra memory footprint for the rings.
>
> If fields get re-arranged, added, or removed, the alignment and padding
> ensure the relevant fields are kept on separate cache lines. This could
> result in unexpected changes in the structure size due to padding to keep
> cachelines separate.
>
> To catch such changes during early development, add build-time compiler
> assertions that check the size of each group to ensure it doesn't exceed 64
> bytes, the expected cache line size. The assertion checks that the size of
> the group excluding any padding at the end is less than the provided size
> of 64 bytes. This type of check will behave the same even on architectures
> with larger cache sizes. The primary aim is to produce a build error if
> developers add fields into a cacheline group that push it past the
> expected 64-byte size.
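
(For reference, the "excluding any padding at the end" behaviour comes
from the size being measured between the group's begin and end markers
rather than via sizeof(). Paraphrased rather than quoted from
include/linux/cache.h, the check amounts to roughly:)

/* The begin/end markers are the zero-size members emitted by the
 * group annotations, so alignment padding placed after the end
 * marker never counts against the limit.
 */
#define GROUP_SIZE_ASSERT(TYPE, GROUP, SIZE)				   \
	BUILD_BUG_ON(offsetof(TYPE, __cacheline_group_end__##GROUP) -	   \
		     offsetofend(TYPE, __cacheline_group_begin__##GROUP) > \
		     (SIZE))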
>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@...el.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
> ---
> drivers/net/ethernet/intel/ice/ice_txrx.h | 26 +++++++++++++++++++++-----
> drivers/net/ethernet/intel/ice/ice_main.c | 2 ++
> 2 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
...
> @@ -298,10 +302,22 @@ struct ice_rx_ring {
> #define ICE_RX_FLAGS_MULTIDEV BIT(3)
> #define ICE_RX_FLAGS_RING_GCS BIT(4)
> u8 flags;
> - /* CL5 - 5th cacheline starts here */
> + __cacheline_group_end_aligned(cl4);
> +
> + __cacheline_group_begin_aligned(cl5);
> struct xdp_rxq_info xdp_rxq;
> + __cacheline_group_end_aligned(cl5);
> } ____cacheline_internodealigned_in_smp;
>
> +static inline void ice_rx_ring_struct_check(void)
> +{
> + CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl1, 64);
> + CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl2, 64);
> + CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl3, 64);
> + CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl4, 64);
> + CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl5, 64);
Hi Jacob,

Unfortunately the last line results in a build failure on ARM (32-bit)
with allmodconfig. It seems that in that case the size of the group is
128 bytes.
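
(In case it is useful to anyone else, a cross-build along these lines
should be enough to reproduce; the toolchain prefix is just whatever
arm-linux-gnueabi toolchain is locally available:)

  $ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- allmodconfig
  $ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- drivers/net/ethernet/intel/ice/
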
> +}
> +
> struct ice_tx_ring {
> /* CL1 - 1st cacheline starts here */
> struct ice_tx_ring *next; /* pointer to next ring in q_vector */
...