netdev - Re: [PATCH v2 net-next 07/11] net: ena: Add more information on TX timeouts

Message-ID: <20240201122705.GA530335@kernel.org>
Date: Thu, 1 Feb 2024 13:27:05 +0100
From: Simon Horman <horms@...nel.org>
To: darinzon@...zon.com
Cc: "Nelson, Shannon" <shannon.nelson@....com>,
	David Miller <davem@...emloft.net>,
	Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
	"Woodhouse, David" <dwmw@...zon.com>,
	"Machulsky, Zorik" <zorik@...zon.com>,
	"Matushevsky, Alexander" <matua@...zon.com>,
	Saeed Bshara <saeedb@...zon.com>, "Wilson, Matt" <msw@...zon.com>,
	"Liguori, Anthony" <aliguori@...zon.com>,
	"Bshara, Nafea" <nafea@...zon.com>,
	"Belgazal, Netanel" <netanel@...zon.com>,
	"Saidi, Ali" <alisaidi@...zon.com>,
	"Herrenschmidt, Benjamin" <benh@...zon.com>,
	"Kiyanovski, Arthur" <akiyano@...zon.com>,
	"Dagan, Noam" <ndagan@...zon.com>,
	"Agroskin, Shay" <shayagr@...zon.com>,
	"Itzko, Shahar" <itzko@...zon.com>,
	"Abboud, Osama" <osamaabb@...zon.com>,
	"Ostrovsky, Evgeny" <evostrov@...zon.com>,
	"Tabachnik, Ofir" <ofirt@...zon.com>,
	"Koler, Nati" <nkoler@...zon.com>
Subject: Re: [PATCH v2 net-next 07/11] net: ena: Add more information on TX
 timeouts

On Tue, Jan 30, 2024 at 09:53:49AM +0000, darinzon@...zon.com wrote:
> From: David Arinzon <darinzon@...zon.com>
> 
> The function responsible for polling TX completions might not receive
> the CPU resources it needs due to higher priority tasks running on the
> requested core.
> 
> The driver might not be able to recognize such cases, but it can use its
> state to suspect that they happened. If both conditions are met:
> 
> - napi hasn't been executed for more than the TX completion timeout value
> - napi is scheduled (meaning that we've received an interrupt)
> 
> Then it's more likely that the napi handler isn't scheduled because of
> an overloaded CPU.
> It was decided that for this case, the driver would wait twice as long
> as the regular timeout before scheduling a reset.
> The driver uses ENA_REGS_RESET_SUSPECTED_POLL_STARVATION reset reason to
> indicate this case to the device.
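
For readers skimming the thread, the two conditions above can be modelled
outside the driver roughly as follows. This is only a sketch: the enum and
the pick_reset_reason() helper are hypothetical stand-ins, not driver code,
and it covers only the choice of reset reason visible in the
ena_tx_timeout() hunk below, not the doubled wait mentioned above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's reset reasons. The real
 * definitions live in ena_regs_defs.h and are not reproduced here.
 */
enum reset_reason {
	RESET_OS_NETDEV_WD,
	RESET_SUSPECTED_POLL_STARVATION,
};

/* Hypothetical helper: suspect poll starvation only when both
 * conditions hold, i.e. napi has not run for longer than the
 * threshold and napi is currently scheduled. Otherwise fall back
 * to the generic netdev watchdog reason.
 */
static enum reset_reason pick_reset_reason(unsigned int threshold_usec,
					   unsigned int usec_since_last_napi,
					   bool napi_scheduled)
{
	if (threshold_usec < usec_since_last_napi && napi_scheduled)
		return RESET_SUSPECTED_POLL_STARVATION;

	return RESET_OS_NETDEV_WD;
}
```

The comparison is strict, so a time since the last napi run that exactly
equals the threshold still yields the watchdog reason, which matches the
"threshold < time_since_last_napi" test in the patch.
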
> 
> This patch also adds more information to the ena_tx_timeout() callback.
> This function is called by the kernel when it detects that a specific TX
> queue has been closed for too long.
> 
> Signed-off-by: Shay Agroskin <shayagr@...zon.com>
> Signed-off-by: David Arinzon <darinzon@...zon.com>
> ---
>  drivers/net/ethernet/amazon/ena/ena_netdev.c  | 77 +++++++++++++++----
>  .../net/ethernet/amazon/ena/ena_regs_defs.h   |  1 +
>  2 files changed, 64 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index 18acb76..ae9291b 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -47,19 +47,44 @@ static int ena_restore_device(struct ena_adapter *adapter);
>  
>  static void ena_tx_timeout(struct net_device *dev, unsigned int txqueue)
>  {
> +	enum ena_regs_reset_reason_types reset_reason = ENA_REGS_RESET_OS_NETDEV_WD;
>  	struct ena_adapter *adapter = netdev_priv(dev);
> +	unsigned int time_since_last_napi, threshold;
> +	struct ena_ring *tx_ring;
> +	int napi_scheduled;
> +
> +	if (txqueue >= adapter->num_io_queues) {
> +		netdev_err(dev, "TX timeout on invalid queue %u\n", txqueue);
> +		goto schedule_reset;
> +	}
> +
> +	threshold = jiffies_to_usecs(dev->watchdog_timeo);
> +	tx_ring = &adapter->tx_ring[txqueue];
> +
> +	time_since_last_napi = jiffies_to_usecs(jiffies - tx_ring->tx_stats.last_napi_jiffies);
> +	napi_scheduled = !!(tx_ring->napi->state & NAPIF_STATE_SCHED);
>  
> +	netdev_err(dev,
> +		   "TX q %d is paused for too long (threshold %u). Time since last napi %u usec. napi scheduled: %d\n",
> +		   txqueue,
> +		   threshold,
> +		   time_since_last_napi,
> +		   napi_scheduled);
> +
> +	if (threshold < time_since_last_napi && napi_scheduled) {
> +		netdev_err(dev,
> +			   "napi handler hasn't been called for a long time but is scheduled\n");
> +			   reset_reason = ENA_REGS_RESET_SUSPECTED_POLL_STARVATION;

Hi David,

a nit from my side: the line above is indented one tab-stop too many.
No need to respin just for this AFAIC.

> +	}
> +schedule_reset:
>  	/* Change the state of the device to trigger reset
>  	 * Check that we are not in the middle or a trigger already
>  	 */
> -
>  	if (test_and_set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
>  		return;
>  
> -	ena_reset_device(adapter, ENA_REGS_RESET_OS_NETDEV_WD);
> +	ena_reset_device(adapter, reset_reason);
>  	ena_increase_stat(&adapter->dev_stats.tx_timeout, 1, &adapter->syncp);
> -
> -	netif_err(adapter, tx_err, dev, "Transmit time out\n");
>  }
>  

...