Message-Id: <20251103-jk-refactor-queue-stats-v1-0-164d2ed859b6@intel.com>
Date: Mon, 03 Nov 2025 17:06:45 -0800
From: Jacob Keller <jacob.e.keller@...el.com>
To: Aleksandr Loktionov <aleksandr.loktionov@...el.com>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org,
Jacob Keller <jacob.e.keller@...el.com>
Subject: [PATCH iwl-next 0/9] ice: properly use u64_stats API for all ring
stats
The ice driver has multiple u64 values stored in the ring structures for
each queue used for statistics. These are accumulated in
ice_update_vsi_stats(). The packet and byte values are read using the
u64_stats API from <linux/u64_stats_sync.h>.
Several non-standard counters are also accumulated in the same function,
but do not use the u64_stats API. This could result in load/store tears on
32-bit architectures. Further, since commit 316580b69d0a ("u64_stats:
provide u64_stats_t type"), the u64 stats API has had u64_stats_t and
access functions which convert to local64_t on 64-bit architectures.
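For illustration, the sequence-counter pattern behind the u64_stats API can be modelled in userspace roughly as follows. This is a minimal sketch using C11 atomics, not kernel code: the model_* names are hypothetical, and a single writer per ring is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Userspace model only; every name here is hypothetical. */
struct model_ring_stats {
	atomic_uint seq;	/* odd while the writer is mid-update */
	uint64_t pkts;
	uint64_t bytes;
};

/* Writer: make seq odd, update both counters, make seq even again. */
static void model_stats_add(struct model_ring_stats *s, uint64_t pkts,
			    uint64_t bytes)
{
	unsigned int seq = atomic_fetch_add(&s->seq, 1);

	assert((seq & 1) == 0);	/* the model assumes one writer per ring */
	s->pkts += pkts;
	s->bytes += bytes;
	atomic_fetch_add(&s->seq, 1);
}

/*
 * Reader: retry until the same even sequence brackets both loads. The plain
 * loads stand in for the kernel accessors; strictly conforming C11 would
 * use relaxed atomics for the counters as well.
 */
static void model_stats_fetch(struct model_ring_stats *s, uint64_t *pkts,
			      uint64_t *bytes)
{
	unsigned int start;

	do {
		start = atomic_load(&s->seq);
		*pkts = s->pkts;
		*bytes = s->bytes;
	} while ((start & 1) || start != atomic_load(&s->seq));
}
```

A counter that is read outside such a retry loop has no way to notice that a writer updated it between the two 32-bit halves of the load, which is the tearing concern described above.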
The ice driver doesn't use u64_stats_t and these access functions. Thus
even on 64-bit architectures it could read inconsistent values. This series
refactors the ice driver to use the updated API. Along the way I noticed
several other issues and inconsistencies which I have cleaned up,
summarized below.
*) The driver never called u64_stats_init, leaving the syncp improperly
initialized. Since the field is part of a kzalloc block, this only
impacts 32-bit systems with CONFIG_LOCKDEP enabled.
*) A few locations accessed the packets and byte counts directly without
using the u64 stats API.
*) The prev_pkt integer field is moved out of the stats structure and into
the ice_tx_ring structure directly.
*) Cache line comments in ice_tx_ring and ice_rx_ring were out of date and
did not match the actual intended layout for systems with 64-byte cache
lines. Convert the structures to use __cacheline_group instead of
comments.
*) The ice_fetch_u64_stats_per_ring() function took the ice_q_stats by
value, defeating the point of using the u64_stats API entirely.
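To make the last item concrete, here is a minimal userspace sketch of why a by-value argument defeats the retry loop. The names are hypothetical stand-ins rather than the driver's own structures or function, and C11 atomics replace the kernel sequence counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins, not the driver's structures. */
struct model_q_stats {
	uint64_t pkts;
	uint64_t bytes;
};

/*
 * By value: the caller copied 'stats' before the loop started, so every
 * retry re-reads the same (possibly torn) snapshot.
 */
static void model_fetch_by_value(atomic_uint *seq, struct model_q_stats stats,
				 uint64_t *pkts, uint64_t *bytes)
{
	unsigned int start;

	do {
		start = atomic_load(seq);
		*pkts = stats.pkts;
		*bytes = stats.bytes;
	} while ((start & 1) || start != atomic_load(seq));
}

/* By pointer: each retry loads the live counters again. */
static void model_fetch_by_ptr(atomic_uint *seq, struct model_q_stats *stats,
			       uint64_t *pkts, uint64_t *bytes)
{
	unsigned int start;

	assert(stats);
	do {
		start = atomic_load(seq);
		*pkts = stats->pkts;
		*bytes = stats->bytes;
	} while ((start & 1) || start != atomic_load(seq));
}
```

In the by-value variant the copy is made outside the section guarded by the sequence counter, so a retry cannot repair a torn read; it only repeats it.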
To keep the stats increments short, I introduced ice_stats_inc, as
otherwise each stat increment has to be quite verbose. Similarly a few
places read only one stat, so I added ice_stats_read for those.
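As a rough idea of what such helpers buy, the sketch below wraps a single-counter increment and a single-counter read in a userspace model. It is not the code from this series: the model_* names and the tx_busy field are assumptions made for illustration, and C11 atomics stand in for the kernel sequence counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring with a few counters; field names are illustrative. */
struct model_ring {
	atomic_uint seq;	/* odd while the writer is mid-update */
	struct {
		uint64_t pkts;
		uint64_t bytes;
		uint64_t tx_busy;
	} stats;
};

/*
 * Increment one named counter inside a writer section. Note that 'ring'
 * is evaluated more than once, so it should not have side effects.
 */
#define model_stats_inc(ring, field)					\
	do {								\
		unsigned int seq_ = atomic_fetch_add(&(ring)->seq, 1);	\
		assert((seq_ & 1) == 0);	/* single writer */	\
		(ring)->stats.field++;					\
		atomic_fetch_add(&(ring)->seq, 1);			\
	} while (0)

/* Read one counter; 'field' is expected to point into ring->stats. */
static uint64_t model_stats_read(struct model_ring *ring,
				 const uint64_t *field)
{
	unsigned int start;
	uint64_t val;

	do {
		start = atomic_load(&ring->seq);
		val = *field;
	} while ((start & 1) || start != atomic_load(&ring->seq));

	return val;
}
```

With helpers of this shape a call site shrinks to one line, for example model_stats_inc(ring, tx_busy), instead of open-coding the begin, update, and end steps each time.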
This version uses struct ice_vsi_(tx|rx)_stats structures defined in
ice_main.c for the accumulator. I haven't come up with a better solution
that allows accumulating nicely without this structure. It's a bit
frustrating as it copies the entries in the ring stats structures but with
u64 instead of u64_stats_t.
I am also still not entirely certain how the ice_update_vsi_ring_stats()
function is synchronized in the ice driver. It is called from multiple
places without an obvious synchronization mechanism. It is ultimately
called from the service task and from ethtool, and I think it may also be
called from one of the netdev stats callbacks.
I'm open to suggestions on ways to improve this, as I think the result
still has some ugly logic and a fair amount of near duplicate code.
I have included the cacheline cleanup in ice_tx_ring and ice_rx_ring here,
but that could arguably be split to its own series. I only noticed it
because of attempting to move the prev_pkt field out of the ring stats. I
replaced the comments with __cacheline_group, but I did not make an attempt
to optimize the existing cachelines. Probably we should experiment with the
method used in idpf with the 'read-mostly', 'read-write' and 'cold'
groupings, but doing so will require a more thorough deep dive on
performance profiling and tuning.
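The benefit of checked grouping over comments can be sketched in userspace as below. This is only an analogy built on C11 _Alignas and static_assert, not the kernel's cacheline group markers, and the structure layout and field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CACHELINE 64	/* assumed cache line size in bytes */

/* Hypothetical ring layout; the field names are illustrative only. */
struct model_tx_ring {
	/* group 1: descriptor bookkeeping touched on every transmit */
	_Alignas(MODEL_CACHELINE) void *desc;
	uint16_t next_to_use;
	uint16_t next_to_clean;
	uint16_t count;

	/* group 2: statistics, starting on their own cache line */
	_Alignas(MODEL_CACHELINE) uint64_t pkts;
	uint64_t bytes;
	int prev_pkt;
};

/* Unlike a comment, these checks fail the build when the layout drifts. */
static_assert(offsetof(struct model_tx_ring, count) + sizeof(uint16_t) <=
	      MODEL_CACHELINE, "group 1 must fit in one cache line");
static_assert(offsetof(struct model_tx_ring, pkts) == MODEL_CACHELINE,
	      "group 2 must start on the second cache line");
static_assert(offsetof(struct model_tx_ring, prev_pkt) + sizeof(int) <=
	      2 * MODEL_CACHELINE, "group 2 must fit in one cache line");
```

Adding a field that pushes a group past its boundary then breaks compilation instead of silently invalidating a comment.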
Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
---
Jacob Keller (9):
ice: initialize ring_stats->syncp
ice: use cacheline groups for ice_rx_ring structure
ice: use cacheline groups for ice_tx_ring structure
ice: move prev_pkt from ice_txq_stats to ice_tx_ring
ice: pass pointer to ice_fetch_u64_stats_per_ring
ice: remove ice_q_stats struct and use struct_group
ice: use u64_stats API to access pkts/bytes in dim sample
ice: shorten ring stat names and add accessors
ice: convert all ring stats to u64_stats_t
drivers/net/ethernet/intel/ice/ice.h | 3 -
drivers/net/ethernet/intel/ice/ice_lib.h | 6 +
drivers/net/ethernet/intel/ice/ice_txrx.h | 135 ++++++++++++-----
drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 2 +-
drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
drivers/net/ethernet/intel/ice/ice_ethtool.c | 30 ++--
drivers/net/ethernet/intel/ice/ice_lib.c | 61 ++++++--
drivers/net/ethernet/intel/ice/ice_main.c | 201 +++++++++++++++++---------
drivers/net/ethernet/intel/ice/ice_txrx.c | 45 +++---
drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 +-
drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +-
11 files changed, 331 insertions(+), 162 deletions(-)
---
base-commit: 4601e382d0413867dbbb150d90e47352d7b0631e
change-id: 20251016-jk-refactor-queue-stats-9e721b34ce01
Best regards,
--
Jacob Keller <jacob.e.keller@...el.com>