Message-ID: <20240930201307.330692-3-mschmidt@redhat.com>
Date: Mon, 30 Sep 2024 22:13:05 +0200
From: Michal Schmidt <mschmidt@...hat.com>
To: Manish Chopra <manishc@...vell.com>,
	netdev@...r.kernel.org
Cc: Caleb Sander <csander@...estorage.com>,
	Alok Prasad <palok@...vell.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>,
	Paolo Abeni <pabeni@...hat.com>
Subject: [PATCH net-next 2/4] qed: put cond_resched() in qed_grc_dump_ctx_data()

On a kernel with preemption set to none or voluntary, 'ethtool -d'
on a qede network device can cause a big latency spike.
Most of it comes from the loop in qed_grc_dump_ctx_data().

The function is called only from the .get_size and .perform_dump
callbacks for the "grc" feature defined in qed_features_lookup[].
As far as I can see, they are used in:
 - qed's devlink health reporter .dump op
 - qede's ethtool get_regs/get_regs_len/get_dump_data ops
 - qedf's qedf_get_grc_dump, called from:
   - qedf_sysfs_write_grcdump - "grcdump" sysfs attribute write
   - qedf_wq_grcdump - a workqueue

It is safe to sleep in all of them.
Let's insert a cond_resched() in the outer loop to let other tasks run.
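
For illustration only, here is a minimal userspace sketch of the same
pattern. It is not the driver code: read_word() and dump_ctx_data() are
hypothetical names, and sched_yield() merely stands in for
cond_resched(), which in the kernel reschedules only when another task
is actually waiting to run.

```c
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one register read; not a qed function. */
static uint32_t read_word(size_t ctx, size_t word)
{
	return (uint32_t)(ctx * 31u + word);
}

/*
 * Copy num_ctx * words_per_ctx words into buf and return the number of
 * words written. buf must have room for that many words.
 */
size_t dump_ctx_data(uint32_t *buf, size_t num_ctx, size_t words_per_ctx)
{
	size_t offset = 0;

	for (size_t ctx = 0; ctx < num_ctx; ctx++) {
		for (size_t word = 0; word < words_per_ctx; word++)
			buf[offset++] = read_word(ctx, word);

		/*
		 * Userspace analogue of cond_resched(): one yield per outer
		 * iteration, so the longest stretch without a scheduling
		 * point is a single pass of the inner loop.
		 */
		sched_yield();
	}

	return offset;
}
```

Placing the call in the outer loop rather than the inner one keeps the
number of scheduling points small while still bounding how long the
task runs without offering the CPU to others.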

Measured using this script:

  #!/bin/bash
  DEV=ens3f1
  echo wakeup_rt > /sys/kernel/tracing/current_tracer
  echo 0 > /sys/kernel/tracing/tracing_max_latency
  echo 1 > /sys/kernel/tracing/tracing_on
  echo "Setting the task CPU affinity"
  taskset -p 1 $$ > /dev/null
  echo "Starting the real-time task"
  chrt -f 50 bash -c 'while sleep 0.01; do :; done' &
  sleep 1
  echo "Running: ethtool -d $DEV"
  time ethtool -d $DEV > /dev/null
  kill %1
  echo 0 > /sys/kernel/tracing/tracing_on
  echo "Measured latency: $(</sys/kernel/tracing/tracing_max_latency) us"
  echo "To see the latency trace: less /sys/kernel/tracing/trace"

The patch lowers the latency from 180 ms to 53 ms on my test system with
voluntary preemption.

Signed-off-by: Michal Schmidt <mschmidt@...hat.com>
---
 drivers/net/ethernet/qlogic/qed/qed_debug.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_debug.c b/drivers/net/ethernet/qlogic/qed/qed_debug.c
index f67be4b8ad43..464a72afb758 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_debug.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_debug.c
@@ -2873,6 +2873,7 @@ static u32 qed_grc_dump_ctx_data(struct qed_hwfn *p_hwfn,
 							  false,
 							  SPLIT_TYPE_NONE, 0);
 		}
+		cond_resched();
 	}
 
 	return offset;
-- 
2.46.2

