Message-ID: <ad30ef4edfc22e7f29280fdec763189efddda9d0.camel@gmx.de>
Date: Thu, 31 Oct 2024 11:21:45 +0100
From: Mike Galbraith <efault@....de>
To: LKML <linux-kernel@...r.kernel.org>
Cc: Ryan Roberts <ryan.roberts@....com>
Subject: RFC [PATCH2] fs/writeback: Mitigate move_expired_inodes() induced
 service latency

(was regression: mm: vmscan:  -  size XL irqoff time increase v6.10+)


Break off queueing of IO after we've been at it for a ms or so and a
preemption is due, to keep writeback latency impact at least reasonable.
The IO we're queueing under spinlock still has to be started under that
same lock.

wakeup_rt tracing caught this function spanning 66ms on an i7-4790 box.

With this patch applied on top of one that mitigates the even worse IRQ
holdoff induced hits (78ms) from isolate_lru_folios(), the same trivial
load (osc kernel package build + bonnie) that produced this, and worse:
T: 1 ( 6211) P:99 I:1500 C: 639971 Min:      1 Act:    7 Avg:   12 Max:   66696

resulted in this perfectly reasonable max:
T: 0 ( 6078) P:99 I:1000 C:1031230 Min:      1 Act:    7 Avg:    4 Max:    4449

Note: cyclictest -Smp99 is only the messenger.  This is not an RT issue;
RT is fingering bad generic behavior.

Signed-off-by: Mike Galbraith <efault@....de>
---
 fs/fs-writeback.c |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -29,6 +29,7 @@
 #include <linux/tracepoint.h>
 #include <linux/device.h>
 #include <linux/memcontrol.h>
+#include <linux/sched/clock.h>
 #include "internal.h"

 /*
@@ -1424,6 +1425,10 @@ static int move_expired_inodes(struct li
 	struct inode *inode;
 	int do_sb_sort = 0;
 	int moved = 0;
+#ifndef CONFIG_PREEMPT_RT
+	u64 then = local_clock();
+	int iter = 0;
+#endif

 	while (!list_empty(delaying_queue)) {
 		inode = wb_inode(delaying_queue->prev);
@@ -1439,6 +1444,19 @@ static int move_expired_inodes(struct li
 		if (sb && sb != inode->i_sb)
 			do_sb_sort = 1;
 		sb = inode->i_sb;
+#ifndef CONFIG_PREEMPT_RT
+		/*
+		 * We're under ->list_lock here, and the IO being queued
+		 * still has to be started. Stop queueing when we've been
+		 * at it for a ms or so and a preemption is due, to keep
+		 * latency impact reasonable.
+		 */
+		if (iter++ < 100 || !need_resched())
+			continue;
+		if (local_clock() - then > NSEC_PER_MSEC)
+			break;
+		iter = 0;
+#endif
 	}

 	/* just one sb in list, splice to dispatch_queue and we're done */
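
For anyone who wants to poke at the bail-out logic outside the kernel, here
is a rough userspace sketch of the same pattern: read the clock only every
100 iterations, and only when a reschedule is pending. It is not kernel
code. resched_pending, now_ns(), stepping_clock() and move_items() are
hypothetical stand-ins invented for illustration (for need_resched(),
local_clock(), a deterministic test clock, and the inode-moving loop), and
the clock is passed in as a function pointer purely so the behavior can be
checked deterministically.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_MSEC	1000000ULL
#define CHECK_INTERVAL	100	/* iterations between clock reads, as in the patch */

/* Hypothetical stand-in for the kernel's need_resched(). */
bool resched_pending;

/* Hypothetical stand-in for local_clock(): monotonic time in ns. */
uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Hypothetical deterministic clock: advances fake_step ns on every read. */
uint64_t fake_now, fake_step;

uint64_t stepping_clock(void)
{
	fake_now += fake_step;
	return fake_now;
}

/*
 * "Move" up to nr_items items.  Stop early once more than budget_ns has
 * elapsed AND a reschedule is pending.  The clock is consulted at most
 * once per CHECK_INTERVAL iterations to keep the check itself cheap.
 * Returns the number of items moved.
 */
int move_items(int nr_items, uint64_t budget_ns, uint64_t (*clock_ns)(void))
{
	uint64_t then = clock_ns();
	int iter = 0;
	int moved = 0;

	assert(nr_items >= 0);

	while (moved < nr_items) {
		moved++;	/* stand-in for moving one inode */

		/* Cheap tests first: iteration count, then pending resched. */
		if (iter++ < CHECK_INTERVAL || !resched_pending)
			continue;
		/* Only now pay for a clock read. */
		if (clock_ns() - then > budget_ns)
			break;
		iter = 0;
	}
	return moved;
}
```

Calling move_items(1000, NSEC_PER_MSEC, now_ns) with resched_pending clear
always moves all 1000 items, mirroring the patch: no pending preemption
means no early break, however long the loop runs.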


Download attachment "wakeup_rt-trace-6.12.0.g4236f913-master.gz" of type "application/gzip" (967956 bytes)
