Message-Id: <20221222221244.1290833-3-kuba@kernel.org>
Date: Thu, 22 Dec 2022 14:12:43 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: peterz@...radead.org, tglx@...utronix.de
Cc: jstultz@...gle.com, edumazet@...gle.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>
Subject: [PATCH 2/3] softirq: avoid spurious stalls due to need_resched()

The need_resched() check added in commit c10d73671ad3 ("softirq: reduce
latencies") does improve latency for real workloads (for example
memcache). Unfortunately, it triggers quite often even for
non-network-heavy apps (~900 times a second on a loaded webserver),
and in a small fraction of cases whatever the scheduler decides to run
will hold onto the CPU for the entire time slice.

10ms+ stalls on a machine which is not actually overloaded cause
erratic network behavior and spurious TCP retransmits. Typical
end-to-end latency in a datacenter is < 200us, so it's common to set
TCP timeouts to 10ms or less.

The intent of the need_resched() check is to let a low-latency
application respond quickly and yield (to ksoftirqd). Put a time limit
on this dance: ignore the fact that ksoftirqd is RUNNING if we already
tried to be nice and the application did not yield quickly.
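
To make the policy concrete, here is a minimal userspace sketch of the
decision logic (an illustration only, not kernel code: the millisecond
clock parameters, the budget_exhausted flag, and the main() driver are
invented stand-ins, while the two constants and the shape of
ksoftirqd_should_handle() mirror the patch below):

#include <stdbool.h>
#include <stdio.h>

#define SOFTIRQ_OVERLOAD_TIME_MS	100	/* back off hard when overloaded */
#define SOFTIRQ_DEFER_TIME_MS		2	/* brief deferral for need_resched() */

static unsigned long overload_limit_ms;		/* per-CPU in the real patch */

/* Called where __do_softirq() decides to stop looping (stand-in name). */
static void softirq_loop_done(unsigned long now_ms, bool budget_exhausted,
			      bool resched_requested)
{
	unsigned long limit_ms;

	if (budget_exhausted)		/* ran out of time or restarts */
		limit_ms = SOFTIRQ_OVERLOAD_TIME_MS;
	else if (resched_requested)	/* need_resched(): be nice, briefly */
		limit_ms = SOFTIRQ_DEFER_TIME_MS;
	else
		return;			/* keep processing inline */

	overload_limit_ms = now_ms + limit_ms;
}

/* Mirrors ksoftirqd_should_handle(): defer only while the window is open. */
static bool ksoftirqd_should_handle(unsigned long now_ms, bool ksoftirqd_running)
{
	if (!ksoftirqd_running)
		return false;
	return now_ms < overload_limit_ms;	/* window expired -> inline again */
}

int main(void)
{
	softirq_loop_done(1000, false, true);	/* need_resched() at t=1000ms */
	printf("t=1001: defer=%d\n", ksoftirqd_should_handle(1001, true)); /* 1 */
	printf("t=1005: defer=%d\n", ksoftirqd_should_handle(1005, true)); /* 0 */
	return 0;
}

The important property is that the deferral window is bounded: once the
limit passes, pending softirqs are handled inline again even though
ksoftirqd is still RUNNING.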

On a webserver loaded at 90% CPU this change reduces the number of
8ms+ stalls seen by network softirq processing by around 10x
(2/sec -> 0.2/sec). It also seems to reduce retransmissions by ~10%,
but that data is quite noisy.

Signed-off-by: Jakub Kicinski <kuba@...nel.org>
---
kernel/softirq.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 00b838d566c1..ad200d386ec1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -59,6 +59,7 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
 DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+static DEFINE_PER_CPU(unsigned long, overload_limit);
 
 const char * const softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
@@ -89,10 +90,15 @@ static void wakeup_softirqd(void)
 static bool ksoftirqd_should_handle(unsigned long pending)
 {
 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
+	unsigned long ov_limit;
 
 	if (pending & SOFTIRQ_NOW_MASK)
 		return false;
-	return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
+	if (likely(!tsk || !task_is_running(tsk) || __kthread_should_park(tsk)))
+		return false;
+
+	ov_limit = __this_cpu_read(overload_limit);
+	return time_is_after_jiffies(ov_limit);
 }
 
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -492,6 +498,9 @@ asmlinkage __visible void do_softirq(void)
 #define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
 #define MAX_SOFTIRQ_RESTART 10
 
+#define SOFTIRQ_OVERLOAD_TIME	msecs_to_jiffies(100)
+#define SOFTIRQ_DEFER_TIME	msecs_to_jiffies(2)
+
 #ifdef CONFIG_TRACE_IRQFLAGS
 /*
  * When we run softirqs from irq_exit() and thus on the hardirq stack we need
@@ -588,10 +597,16 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 
 	pending = local_softirq_pending();
 	if (pending) {
-		if (time_before(jiffies, end) && !need_resched() &&
-		    --max_restart)
+		unsigned long limit;
+
+		if (time_is_before_eq_jiffies(end) || !--max_restart)
+			limit = SOFTIRQ_OVERLOAD_TIME;
+		else if (need_resched())
+			limit = SOFTIRQ_DEFER_TIME;
+		else
 			goto restart;
 
+		__this_cpu_write(overload_limit, jiffies + limit);
 		wakeup_softirqd();
 	}
 
--
2.38.1