Message-ID: <YfQMQWsFqCIPBBqO@boxer>
Date:   Fri, 28 Jan 2022 16:31:13 +0100
From:   Maciej Fijalkowski <maciej.fijalkowski@...el.com>
To:     "Maurice Baijens (Ellips B.V.)" <maurice.baijens@...ips.com>
Cc:     "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: ixgbe driver link down causes 100% load in ksoftirqd/x

On Thu, Jan 20, 2022 at 09:23:06AM +0000, Maurice Baijens (Ellips B.V.) wrote:
> Hello,
> 
> 
> I have an issue with the ixgbe driver and X550Tx network adapter.
> When I disconnect the network cable I end up with 100% load in ksoftirqd/x. I am running the adapter in
> xdp mode (XDP_FLAGS_DRV_MODE). Problem seen in linux kernel 5.15.x and also 5.16.0+ (head).

Hello,

A stupid question: why do you disconnect the cable while running traffic? :)
What happens if you plug it back in?

> 
> I traced the problem down to function ixgbe_xmit_zc in ixgbe_xsk.c:
> 
> if (unlikely(!ixgbe_desc_unused(xdp_ring)) ||
>     !netif_carrier_ok(xdp_ring->netdev)) {
>             work_done = false;
>             break;
> }

This check was added in commit c685c69fba71 ("ixgbe: don't do any AF_XDP
zero-copy transmit if netif is not OK"). It addressed the transient state
that occurs while the xsk pool is being configured on a particular queue pair.

> 
> This function is called from ixgbe_poll() function via ixgbe_clean_xdp_tx_irq(). It sets
> work_done to false if netif_carrier_ok() returns false (so if link is down). Because work_done
> is always false, ixgbe_poll keeps on polling forever.
> 
> I made a fix by checking link in ixgbe_poll() function and if no link exiting polling mode:
> 
> /* If all work not completed, return budget and keep polling */
> if ((!clean_complete) && netif_carrier_ok(adapter->netdev))
>             return budget;

I'm not sure this is correct. The question is how we should act when the
link is down: should we report that we are done with processing, or should
we wait until the link comes back?

Instead of setting work_done to false immediately on !netif_carrier_ok(),
I'd rather split the checks that are currently combined into a single
statement, something like this:

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index b3fd8e5cd85b..6a5e9cf6b5da 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -390,12 +390,14 @@ static bool ixgbe_xmit_zc(struct ixgbe_ring *xdp_ring, unsigned int budget)
 	u32 cmd_type;
 
 	while (budget-- > 0) {
-		if (unlikely(!ixgbe_desc_unused(xdp_ring)) ||
-		    !netif_carrier_ok(xdp_ring->netdev)) {
+		if (unlikely(!ixgbe_desc_unused(xdp_ring))) {
 			work_done = false;
 			break;
 		}
 
+		if (!netif_carrier_ok(xdp_ring->netdev))
+			break;
+
 		if (!xsk_tx_peek_desc(pool, &desc))
 			break;
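
To illustrate why splitting the checks matters, here is a minimal userspace
sketch, not driver code. It assumes the simplified contract described above:
the poll routine is called again for as long as the transmit helper reports
unfinished work. struct model_ring, xmit_zc_combined(), xmit_zc_split() and
count_polls() are hypothetical names made up for this model; only the order
of the checks mirrors ixgbe_xmit_zc().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring and netdev state (not driver structs). */
struct model_ring {
	bool carrier_ok;       /* models netif_carrier_ok() */
	unsigned int unused;   /* models ixgbe_desc_unused() */
	unsigned int pending;  /* descriptors waiting in the xsk TX ring */
};

/* Combined check: link down is reported as "work not done". */
static bool xmit_zc_combined(struct model_ring *r, unsigned int budget)
{
	bool work_done = true;

	while (budget-- > 0) {
		if (!r->unused || !r->carrier_ok) {
			work_done = false;
			break;
		}
		if (!r->pending)
			break;
		r->pending--;
		r->unused--;
	}
	return work_done;
}

/* Split checks: only a full ring is reported as "work not done". */
static bool xmit_zc_split(struct model_ring *r, unsigned int budget)
{
	bool work_done = true;

	while (budget-- > 0) {
		if (!r->unused) {
			work_done = false;
			break;
		}
		if (!r->carrier_ok)
			break;
		if (!r->pending)
			break;
		r->pending--;
		r->unused--;
	}
	return work_done;
}

/* Poll until the helper reports completion; give up after max_polls. */
static unsigned int count_polls(bool (*xmit)(struct model_ring *, unsigned int),
				struct model_ring *r, unsigned int budget,
				unsigned int max_polls)
{
	unsigned int polls = 0;

	assert(budget > 0);
	while (polls < max_polls) {
		polls++;
		if (xmit(r, budget))
			break;
	}
	return polls;
}
```

In this model, with carrier_ok set to false, count_polls() runs into its cap
with the combined check and stops after a single poll with the split checks,
leaving the pending descriptors untouched in both cases. Whether leaving them
queued until the link returns is the right behaviour is the open question
raised above.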


> 
> This is probably fine for our application as we only run in xdpdrv mode, however I am not sure this

By xdpdrv I would understand that you're running XDP in standard native
mode; however, you refer to the AF_XDP zero-copy implementation in the
driver. But I don't think it changes anything in this thread.

In the end I see some outstanding issues with ixgbe_xmit_zc(), so it
probably needs some attention.

Thanks!
Maciej

> is the correct way to fix this issue and the behaviour of the normal skb mode operation is 
> also affected by my fix.
> 
> So hopefully my observations are correct and someone here can fix the issue and push it upstream.
> 
> 
> Best regards,
> 	Maurice Baijens
