Message-Id: <20241114021711.5691-1-laoar.shao@gmail.com>
Date: Thu, 14 Nov 2024 10:17:11 +0800
From: Yafang Shao <laoar.shao@...il.com>
To: ttoukan.linux@...il.com,
gal@...dia.com,
kuba@...nel.org,
saeedm@...dia.com,
tariqt@...dia.com,
leon@...nel.org
Cc: netdev@...r.kernel.org,
linux-rdma@...r.kernel.org,
Yafang Shao <laoar.shao@...il.com>
Subject: [PATCH v2 net-next] net/mlx5e: Report rx_discards_phy via rx_fifo_errors
We observed a high number of rx_discards_phy events on some servers when
running `ethtool -S`. However, this important counter is not currently
reflected in the /proc/net/dev statistics file, making it challenging to
monitor effectively.
Since rx_fifo_errors represents receive FIFO errors on this network
device, it makes sense to report rx_discards_phy through this counter to
improve monitoring visibility. This change lets administrators track
these events through standard interfaces such as /proc/net/dev.
I’ve also reviewed the manual for ethtool counters on mlx5 [0], and it
appears that rx_discards_phy and rx_fifo_errors have the same meaning.
  rx_discards_phy: The number of received packets dropped due to lack of
                   buffers on a physical port. If this counter is
                   increasing, it implies that the adapter is congested and
                   cannot absorb the traffic coming from the network.
                   ConnectX-3 naming : rx_fifo_errors
The documentation in if_link.h has been updated accordingly.
Link: https://enterprise-support.nvidia.com/s/article/understanding-mlx5-ethtool-counters [0]
Suggested-by: Tariq Toukan <ttoukan.linux@...il.com>
Signed-off-by: Yafang Shao <laoar.shao@...il.com>
Cc: Tariq Toukan <ttoukan.linux@...il.com>
Cc: Saeed Mahameed <saeedm@...dia.com>
Cc: Leon Romanovsky <leon@...nel.org>
Cc: Gal Pressman <gal@...dia.com>
Cc: Jakub Kicinski <kuba@...nel.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1 +
include/uapi/linux/if_link.h | 4 ----
2 files changed, 1 insertion(+), 4 deletions(-)
Changes:
v1->v2:
- Use rx_fifo_errors instead (Tariq)
- Update the if_link.h accordingly
v1: https://lore.kernel.org/netdev/20241106064015.4118-1-laoar.shao@gmail.com/
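For anyone who wants to check the result from user space, here is a
minimal sketch of reading the receive "fifo" column of /proc/net/dev,
which the kernel fills from rx_fifo_errors. It assumes the usual column
order (bytes packets errs drop fifo ...). The helper names
rx_fifo_from_line() and rx_fifo_for_dev() are hypothetical, chosen only
for illustration, and are not part of this patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one data line of /proc/net/dev and extract the receive "fifo"
 * column. Returns 0 on success, -1 if the line is not a data line
 * (for example one of the two header lines).
 * Hypothetical helper, for illustration only.
 */
static int rx_fifo_from_line(const char *line, char ifname[32],
			     unsigned long long *fifo)
{
	unsigned long long bytes, packets, errs, drop;

	/* Receive columns: bytes packets errs drop fifo ... */
	if (sscanf(line, " %31[^: ]: %llu %llu %llu %llu %llu",
		   ifname, &bytes, &packets, &errs, &drop, fifo) != 6)
		return -1;
	return 0;
}

/*
 * Scan a /proc/net/dev style file for interface "dev" and store its
 * receive fifo count in *fifo. Returns 0 if found, -1 otherwise.
 * Hypothetical helper, for illustration only.
 */
static int rx_fifo_for_dev(const char *path, const char *dev,
			   unsigned long long *fifo)
{
	char line[512], name[32];
	unsigned long long val;
	FILE *fp = fopen(path, "r");
	int ret = -1;

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp)) {
		if (rx_fifo_from_line(line, name, &val) == 0 &&
		    strcmp(name, dev) == 0) {
			*fifo = val;
			ret = 0;
			break;
		}
	}
	fclose(fp);
	return ret;
}
```

With this patch applied, rx_fifo_for_dev("/proc/net/dev", "<ifname>", &v)
on an mlx5 interface should be expected to track the rx_discards_phy value
shown by `ethtool -S`.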
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e601324a690a..15b1a3e6e641 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3916,6 +3916,7 @@ mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
}
stats->rx_missed_errors = priv->stats.qcnt.rx_out_of_buffer;
+ stats->rx_fifo_errors = PPORT_2863_GET(pstats, if_in_discards);
stats->rx_length_errors =
PPORT_802_3_GET(pstats, a_in_range_length_errors) +
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 6dc258993b17..16dfaf5f47ca 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -144,10 +144,6 @@ struct rtnl_link_stats {
* not correspond one-to-one with dropped packets.
*
* This statistics was used interchangeably with @rx_over_errors.
- * Not recommended for use in drivers for high speed interfaces.
- *
- * This statistic is used on software devices, e.g. to count software
- * packet queue overflow (can) or sequencing errors (GRE).
*
* @rx_missed_errors: Count of packets missed by the host.
* Folded into the "drop" counter in `/proc/net/dev`.
--
2.43.5