Date: Thu, 22 Dec 2022 14:12:43 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: peterz@...radead.org, tglx@...utronix.de
Cc: jstultz@...gle.com, edumazet@...gle.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>
Subject: [PATCH 2/3] softirq: avoid spurious stalls due to need_resched()

The need_resched() check added in commit c10d73671ad3 ("softirq:
reduce latencies") does improve latency for real workloads (for
example memcache). Unfortunately it triggers quite often even for
non-network-heavy apps (~900 times a second on a loaded webserver),
and in a small fraction of cases whatever the scheduler decided to run
will hold onto the CPU for the entire time slice. 10ms+ stalls on a
machine which is not actually under overload cause erratic network
behavior and spurious TCP retransmits. Typical end-to-end latency in
a datacenter is < 200us, so it's common to set TCP timeouts to 10ms
or less.

The intent of the need_resched() check is to let a low-latency
application respond quickly and yield (to ksoftirqd). Put a time limit
on this dance: ignore the fact that ksoftirqd is RUNNING if we were
trying to be nice and the application did not yield quickly.

On a webserver loaded at 90% CPU this change reduces the number of
8ms+ stalls the network softirq processing sees by around 10x
(2/sec -> 0.2/sec). It also seems to reduce retransmissions by ~10%,
but that data is quite noisy.

Signed-off-by: Jakub Kicinski <kuba@...nel.org>
---
 kernel/softirq.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 00b838d566c1..ad200d386ec1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -59,6 +59,7 @@ EXPORT_PER_CPU_SYMBOL(irq_stat);
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
 DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+static DEFINE_PER_CPU(unsigned long, overload_limit);
 
 const char * const softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL",
@@ -89,10 +90,15 @@ static void wakeup_softirqd(void)
 static bool ksoftirqd_should_handle(unsigned long pending)
 {
 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
+	unsigned long ov_limit;
 
 	if (pending & SOFTIRQ_NOW_MASK)
 		return false;
-	return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
+	if (likely(!tsk || !task_is_running(tsk) || __kthread_should_park(tsk)))
+		return false;
+
+	ov_limit = __this_cpu_read(overload_limit);
+	return time_is_after_jiffies(ov_limit);
 }
 
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -492,6 +498,9 @@ asmlinkage __visible void do_softirq(void)
 #define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
 #define MAX_SOFTIRQ_RESTART 10
 
+#define SOFTIRQ_OVERLOAD_TIME	msecs_to_jiffies(100)
+#define SOFTIRQ_DEFER_TIME	msecs_to_jiffies(2)
+
 #ifdef CONFIG_TRACE_IRQFLAGS
 /*
  * When we run softirqs from irq_exit() and thus on the hardirq stack we need
@@ -588,10 +597,16 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 
 	pending = local_softirq_pending();
 	if (pending) {
-		if (time_before(jiffies, end) && !need_resched() &&
-		    --max_restart)
+		unsigned long limit;
+
+		if (time_is_before_eq_jiffies(end) || !--max_restart)
+			limit = SOFTIRQ_OVERLOAD_TIME;
+		else if (need_resched())
+			limit = SOFTIRQ_DEFER_TIME;
+		else
 			goto restart;
 
+		__this_cpu_write(overload_limit, jiffies + limit);
 		wakeup_softirqd();
 	}
 
-- 
2.38.1
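
To make the timing logic above easier to follow, here is a minimal,
self-contained userspace C sketch of the deferral window the patch
introduces. The simulated jiffies counter, the tick values, and
defer_to_ksoftirqd() are illustrative stand-ins rather than kernel
code; time_is_after_jiffies() mirrors the kernel macro of the same
name, which is true while its argument still lies in the future:

/*
 * Userspace model of the patch's deferral window (illustrative only).
 * jiffies, the constants, and defer_to_ksoftirqd() are stand-ins; the
 * real per-CPU state and helpers live in kernel/softirq.c.
 */
#include <stdbool.h>
#include <stdio.h>

static unsigned long jiffies;		/* simulated tick counter */
static unsigned long overload_limit;	/* per-CPU in the real patch */

#define SOFTIRQ_OVERLOAD_TIME	100	/* ~100ms worth of ticks at HZ=1000 */
#define SOFTIRQ_DEFER_TIME	2	/* ~2ms worth of ticks at HZ=1000 */

/* Same test as the kernel's time_is_after_jiffies(a): 'a' is in the future. */
static bool time_is_after_jiffies(unsigned long a)
{
	return (long)(jiffies - a) < 0;
}

/* Core of the new ksoftirqd_should_handle(): defer only inside the window. */
static bool defer_to_ksoftirqd(bool ksoftirqd_running)
{
	return ksoftirqd_running && time_is_after_jiffies(overload_limit);
}

int main(void)
{
	/* __do_softirq() saw need_resched(): open a short "be nice" window. */
	overload_limit = jiffies + SOFTIRQ_DEFER_TIME;

	for (int tick = 0; tick < 5; tick++, jiffies++)
		printf("tick %d: defer=%d\n", tick,
		       defer_to_ksoftirqd(true));
	return 0;
}

Run as-is, this prints defer=1 for the first two ticks and defer=0
afterwards: once the window opened on need_resched() lapses, pending
softirqs are processed inline again even though ksoftirqd is still
RUNNING, bounding the stall to roughly SOFTIRQ_DEFER_TIME instead of a
full scheduler time slice. A genuine overload (time budget or restart
count exhausted) opens the much longer SOFTIRQ_OVERLOAD_TIME window
instead, handing the work to ksoftirqd for an extended period.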