netdev - RE: [PATCH v2 net-next 07/11] net: ena: Add more information on TX timeouts

Message-ID: <903a717258b548e19314b35e1ff9b638@amazon.com>
Date: Thu, 1 Feb 2024 12:53:41 +0000
From: "Arinzon, David" <darinzon@...zon.com>
To: Simon Horman <horms@...nel.org>
CC: "Nelson, Shannon" <shannon.nelson@....com>, David Miller
	<davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "Woodhouse, David"
	<dwmw@...zon.co.uk>, "Machulsky, Zorik" <zorik@...zon.com>, "Matushevsky,
 Alexander" <matua@...zon.com>, "Bshara, Saeed" <saeedb@...zon.com>, "Wilson,
 Matt" <msw@...zon.com>, "Liguori, Anthony" <aliguori@...zon.com>, "Bshara,
 Nafea" <nafea@...zon.com>, "Belgazal, Netanel" <netanel@...zon.com>, "Saidi,
 Ali" <alisaidi@...zon.com>, "Herrenschmidt, Benjamin" <benh@...zon.com>,
	"Kiyanovski, Arthur" <akiyano@...zon.com>, "Dagan, Noam" <ndagan@...zon.com>,
	"Agroskin, Shay" <shayagr@...zon.com>, "Itzko, Shahar" <itzko@...zon.com>,
	"Abboud, Osama" <osamaabb@...zon.com>, "Ostrovsky, Evgeny"
	<evostrov@...zon.com>, "Tabachnik, Ofir" <ofirt@...zon.com>, "Koler, Nati"
	<nkoler@...zon.com>
Subject: RE: [PATCH v2 net-next 07/11] net: ena: Add more information on TX timeouts

> On Tue, Jan 30, 2024 at 09:53:49AM +0000, darinzon@...zon.com wrote:
> > From: David Arinzon <darinzon@...zon.com>
> >
> > The function responsible for polling TX completions might not receive
> > the CPU resources it needs due to higher priority tasks running on the
> > requested core.
> >
> > The driver might not be able to recognize such cases, but it can use
> > its state to suspect that they happened. If both conditions are met:
> >
> > - napi hasn't been executed more than the TX completion timeout value
> > - napi is scheduled (meaning that we've received an interrupt)
> >
> > Then it's more likely that the napi handler isn't scheduled because of
> > an overloaded CPU.
> > It was decided that for this case, the driver would wait twice as long
> > as the regular timeout before scheduling a reset.
> > The driver uses ENA_REGS_RESET_SUSPECTED_POLL_STARVATION reset reason
> > to indicate this case to the device.
> >
> > This patch also adds more information to the ena_tx_timeout() callback.
> > This function is called by the kernel when it detects that a specific
> > TX queue has been closed for too long.
> >
> > Signed-off-by: Shay Agroskin <shayagr@...zon.com>
> > Signed-off-by: David Arinzon <darinzon@...zon.com>
> > ---
> >  drivers/net/ethernet/amazon/ena/ena_netdev.c  | 77 +++++++++++++++----
> >  .../net/ethernet/amazon/ena/ena_regs_defs.h   |  1 +
> >  2 files changed, 64 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> > index 18acb76..ae9291b 100644
> > --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> > +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> > @@ -47,19 +47,44 @@ static int ena_restore_device(struct ena_adapter
> > *adapter);
> >
> >  static void ena_tx_timeout(struct net_device *dev, unsigned int
> > txqueue)  {
> > +     enum ena_regs_reset_reason_types reset_reason = ENA_REGS_RESET_OS_NETDEV_WD;
> >       struct ena_adapter *adapter = netdev_priv(dev);
> > +     unsigned int time_since_last_napi, threshold;
> > +     struct ena_ring *tx_ring;
> > +     int napi_scheduled;
> > +
> > +     if (txqueue >= adapter->num_io_queues) {
> > +             netdev_err(dev, "TX timeout on invalid queue %u\n", txqueue);
> > +             goto schedule_reset;
> > +     }
> > +
> > +     threshold = jiffies_to_usecs(dev->watchdog_timeo);
> > +     tx_ring = &adapter->tx_ring[txqueue];
> > +
> > +     time_since_last_napi = jiffies_to_usecs(jiffies - tx_ring->tx_stats.last_napi_jiffies);
> > +     napi_scheduled = !!(tx_ring->napi->state & NAPIF_STATE_SCHED);
> >
> > +     netdev_err(dev,
> > +                "TX q %d is paused for too long (threshold %u). Time since last napi %u usec. napi scheduled: %d\n",
> > +                txqueue,
> > +                threshold,
> > +                time_since_last_napi,
> > +                napi_scheduled);
> > +
> > +     if (threshold < time_since_last_napi && napi_scheduled) {
> > +             netdev_err(dev,
> > +                        "napi handler hasn't been called for a long time but is scheduled\n");
> > +                        reset_reason = ENA_REGS_RESET_SUSPECTED_POLL_STARVATION;
> 
> Hi David,
> 
> a nit from my side: the line above is indented one tab-stop too many.
> No need to respin just for this AFAIC.
> 

Hi Simon,

Thanks for pointing it out. It seems I got a bit carried away with the
indentation because of the print above it.

Thanks,
David

> > +     }
> > +schedule_reset:
> >       /* Change the state of the device to trigger reset
> >        * Check that we are not in the middle or a trigger already
> >        */
> > -
> >       if (test_and_set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
> >               return;
> >
> > -     ena_reset_device(adapter, ENA_REGS_RESET_OS_NETDEV_WD);
> > +     ena_reset_device(adapter, reset_reason);
> >       ena_increase_stat(&adapter->dev_stats.tx_timeout, 1, &adapter->syncp);
> > -
> > -     netif_err(adapter, tx_err, dev, "Transmit time out\n");
> >  }
> >
> 
> ...
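
For readers following the thread, the reset-reason decision in the quoted hunk can be sketched in isolation roughly as below. This is only an illustration and not the driver code: the enum, its members, and pick_reset_reason() are hypothetical stand-ins, and the inputs are assumed to be already converted to microseconds as in the patch. It covers only the condition visible in the hunk, not the doubled wait mentioned in the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's reset reasons (illustration only). */
enum reset_reason {
	RESET_OS_NETDEV_WD,               /* plain netdev watchdog timeout */
	RESET_SUSPECTED_POLL_STARVATION,  /* napi scheduled but not run in time */
};

/*
 * Mirror the condition from the quoted hunk: suspect poll starvation only
 * when napi has not run for longer than the watchdog threshold AND napi is
 * currently marked as scheduled. Otherwise keep the default watchdog reason.
 */
static enum reset_reason pick_reset_reason(unsigned int threshold_usec,
					   unsigned int since_last_napi_usec,
					   bool napi_scheduled)
{
	if (threshold_usec < since_last_napi_usec && napi_scheduled)
		return RESET_SUSPECTED_POLL_STARVATION;

	return RESET_OS_NETDEV_WD;
}
```

With a 5 second threshold, a queue whose napi last ran 6 seconds ago and is still scheduled would be classified as suspected starvation; the same queue with napi not scheduled would keep the default watchdog reason.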