Date:	Mon, 20 Jun 2011 10:16:32 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	linux kernel mailing list <linux-kernel@...r.kernel.org>,
	Jens Axboe <axboe@...nel.dk>
Cc:	Tao Ma <tm@....ma>
Subject: [PATCH] cfq: Fix starvation of async writes in presence of heavy
 sync workload

In the presence of a heavy sync workload, CFQ can starve async writes.
If one launches multiple readers (say 16), one can see CFQ withhold
dispatch of WRITEs for a very long time, say 200 or 300 seconds.

Basically, CFQ schedules an async queue but does not dispatch any
writes because it is waiting for existing sync requests in the queue to
finish. While it is waiting, one reader or another gets queued up and
preempts the async queue. So we did schedule the async queue but never
dispatched anything from it. This can repeat for a long time, hence
practically starving writers.

This patch allows the async queue to dispatch at least 1 request once
it gets scheduled, and denies preemption if the async queue has been
waiting for sync requests to drain and has not been able to dispatch
a request yet.
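
As a rough illustration of the behaviour described above, here is a
minimal user-space sketch. It is not kernel code: struct toy_queue,
toy_should_preempt() and toy_writes_dispatched() are hypothetical names
made up for this example, and the model assumes the worst case where a
reader arrives in every round before the async queue has dispatched
anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the active queue; not struct cfq_queue. */
struct toy_queue {
	bool sync;           /* true for a sync (reader) queue */
	int slice_dispatch;  /* requests dispatched in the current slice */
};

/*
 * Simplified preemption decision for a newly arrived request.
 * With patched == true, an async queue that has not dispatched
 * anything in its slice is not preempted by a sync request.
 */
static bool toy_should_preempt(bool rq_sync, const struct toy_queue *active,
			       bool patched)
{
	if (rq_sync && !active->sync) {
		if (patched && active->slice_dispatch == 0)
			return false;
		return true;
	}
	return false;
}

/*
 * Assumed worst case: in every round the async queue is scheduled,
 * and a reader shows up before it has dispatched anything.
 * Returns the number of writes dispatched over all rounds.
 */
static int toy_writes_dispatched(int rounds, bool patched)
{
	int writes = 0;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		struct toy_queue async_q = { .sync = false, .slice_dispatch = 0 };

		if (toy_should_preempt(true, &async_q, patched))
			continue;	/* slice lost, nothing dispatched */

		async_q.slice_dispatch++;
		writes++;
	}
	return writes;
}
```

In this toy model the unpatched rule never lets a write through, while
the patched rule lets one write through per scheduled slice. Once the
async queue has dispatched a request, a sync request preempts it as
before.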

One concern with this fix is how it impacts readers when heavy
writing is going on.

I did a test where I launched firefox, loaded a website, closed
firefox, and measured the time. I ran the test 3 times and took
the average.

- Vanilla kernel time ~= 1 minute 40 seconds
- Patched kernel time ~= 1 minute 35 seconds

Basically, it looks like times have not changed much for this
test. But I would not claim that the patch does not impact
readers' latencies at all. It might show up in other workloads.

I think we need to fix writer starvation anyway. If this patch
causes issues, then we need to look at reducing the writers'
queue depth further to improve latencies for readers.

Reported-and-Tested-by: Tao Ma <tm@....ma>
Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
---
 block/cfq-iosched.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6/block/cfq-iosched.c
===================================================================
--- linux-2.6.orig/block/cfq-iosched.c	2011-06-10 10:05:34.660781278 -0400
+++ linux-2.6/block/cfq-iosched.c	2011-06-20 08:29:13.328186380 -0400
@@ -3315,8 +3315,15 @@ cfq_should_preempt(struct cfq_data *cfqd
 	 * if the new request is sync, but the currently running queue is
 	 * not, let the sync request have priority.
 	 */
-	if (rq_is_sync(rq) && !cfq_cfqq_sync(cfqq))
+	if (rq_is_sync(rq) && !cfq_cfqq_sync(cfqq)) {
+		/*
+		 * Allow atleast one dispatch otherwise this can repeat
+		 * and writes can be starved completely
+		 */
+		if (!cfqq->slice_dispatch)
+			return false;
 		return true;
+	}
 
 	if (new_cfqq->cfqg != cfqq->cfqg)
 		return false;


