Message-ID: <1757237976-531416-3-git-send-email-tariqt@nvidia.com>
Date: Sun, 7 Sep 2025 12:39:36 +0300
From: Tariq Toukan <tariqt@...dia.com>
To: Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
	Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew+netdev@...n.ch>, "David
 S. Miller" <davem@...emloft.net>
CC: Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>,
	Tariq Toukan <tariqt@...dia.com>, Mark Bloch <mbloch@...dia.com>, "Jonathan
 Corbet" <corbet@....net>, Jiri Pirko <jiri@...nulli.us>,
	<netdev@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Gal Pressman
	<gal@...dia.com>, Dragos Tatulea <dtatulea@...dia.com>
Subject: [PATCH net-next 2/2] net/mlx5e: Add stale counter for PCIe congestion events
From: Dragos Tatulea <dtatulea@...dia.com>
Add an ethtool counter, pci_bw_stale_event, that counts how many times a
PCIe congestion event was triggered but the subsequent query showed no
state change. This helps to indicate when a work item was scheduled to
run too late and, in the meantime, the congestion state changed back to
its previous value.

While at it, fix a typo in the documentation for pci_bw_inbound_high.
Signed-off-by: Dragos Tatulea <dtatulea@...dia.com>
Signed-off-by: Tariq Toukan <tariqt@...dia.com>
---
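Note for readers: the stale-event accounting can be illustrated with a
minimal standalone sketch. It is not the driver code. The flag names
(CONG_INBOUND, CONG_OUTBOUND), the structs, and handle_event() are
hypothetical, and the high/low bookkeeping is an assumption based on the
counter descriptions in counters.rst. Only the XOR check and the stale
counter increment mirror the hunk below.

```c
#include <stdint.h>

/* Hypothetical state bits; the driver's real flag names may differ. */
#define CONG_INBOUND  (1u << 0)
#define CONG_OUTBOUND (1u << 1)

struct cong_stats {
	uint32_t inbound_high;
	uint32_t inbound_low;
	uint32_t outbound_high;
	uint32_t outbound_low;
	uint32_t stale_event;
};

struct cong_event {
	uint32_t state;			/* last congestion state seen */
	struct cong_stats stats;
};

/*
 * Process one congestion event. new_state stands in for the result of
 * querying the device after the event fired.
 */
void handle_event(struct cong_event *ev, uint32_t new_state)
{
	uint32_t changes = ev->state ^ new_state;

	/* Event fired, but the queried state matches the cached one. */
	if (!changes) {
		ev->stats.stale_event++;
		return;
	}

	ev->state = new_state;

	/* Assumed bookkeeping: count each direction's transition. */
	if (changes & CONG_INBOUND) {
		if (new_state & CONG_INBOUND)
			ev->stats.inbound_high++;
		else
			ev->stats.inbound_low++;
	}
	if (changes & CONG_OUTBOUND) {
		if (new_state & CONG_OUTBOUND)
			ev->stats.outbound_high++;
		else
			ev->stats.outbound_low++;
	}
}
```

In this sketch, feeding the same state twice in a row increments only
stale_event, which models a work item that ran after the state had
already flipped back.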
 .../device_drivers/ethernet/mellanox/mlx5/counters.rst     | 7 ++++++-
 .../net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c   | 7 ++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
index 754c81436408..cc498895f92e 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
@@ -1348,7 +1348,7 @@ Device Counters
        is in a congested state.
        If pci_bw_inbound_high == pci_bw_inbound_low then the device is not congested.
        If pci_bw_inbound_high > pci_bw_inbound_low then the device is congested.
-     - Tnformative
+     - Informative
 
    * - `pci_bw_inbound_low`
      - The number of times the device crossed the low inbound PCIe bandwidth
@@ -1373,3 +1373,8 @@ Device Counters
        If pci_bw_outbound_high == pci_bw_outbound_low then the device is not congested.
        If pci_bw_outbound_high > pci_bw_outbound_low then the device is congested.
      - Informative
+
+   * - `pci_bw_stale_event`
+     - The number of times the device fired a PCIe congestion event but on query
+       there was no change in state.
+     - Informative
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
index 0cf142f71c09..2eb666a46f39 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
@@ -24,6 +24,7 @@ struct mlx5e_pcie_cong_stats {
 	u32 pci_bw_inbound_low;
 	u32 pci_bw_outbound_high;
 	u32 pci_bw_outbound_low;
+	u32 pci_bw_stale_event;
 };
 
 struct mlx5e_pcie_cong_event {
@@ -52,6 +53,8 @@ static const struct counter_desc mlx5e_pcie_cong_stats_desc[] = {
 			     pci_bw_outbound_high) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_pcie_cong_stats,
 			     pci_bw_outbound_low) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_pcie_cong_stats,
+			     pci_bw_stale_event) },
 };
 
 #define NUM_PCIE_CONG_COUNTERS ARRAY_SIZE(mlx5e_pcie_cong_stats_desc)
@@ -212,8 +215,10 @@ static void mlx5e_pcie_cong_event_work(struct work_struct *work)
 	}
 
 	changes = cong_event->state ^ new_cong_state;
-	if (!changes)
+	if (!changes) {
+		cong_event->stats.pci_bw_stale_event++;
 		return;
+	}
 
 	cong_event->state = new_cong_state;
 
-- 
2.31.1