Message-ID: <4E1407A1.5080809@parallels.com>
Date:	Wed, 6 Jul 2011 10:58:41 +0400
From:	Konstantin Khlebnikov <khlebnikov@...allels.com>
To:	Vivek Goyal <vgoyal@...hat.com>
CC:	Jens Axboe <axboe@...nel.dk>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC 1/2] cfq: request-deadline policy

Vivek Goyal wrote:
> On Mon, Jul 04, 2011 at 05:08:38PM +0400, Konstantin Khlebnikov wrote:
>> CFQ is designed for sharing disk bandwidth proportionally between queues and groups
>> and for reordering requests to reduce disk seek time. Currently it cannot
>> guarantee or estimate latency for individual requests: even if latencies are low
>> for almost all requests, some of them can get stuck inside the scheduler for a long time.
>> The fair policy is fine until someone unlucky starts to die due to a timeout.
>>
>> This patch implements FIFO request dispatching with a deadline policy: cfq is now
>> obliged to dispatch a request if it has been stuck in the queue for longer than the deadline.
>>
>> This way cfq can try to ensure the expected latency of request execution.
>> It is like a safety valve: it should not be active all the time, but it should keep latency
>> in a sane range when the scheduler is unable to handle the flow of requests effectively,
>> especially in cases where "noop" or "deadline" shows better performance.
>>
>> The deadline can be tuned via /sys/block/<device>/queue/iosched/deadline_{sync,async};
>> the default is 2000ms for sync and 4000ms for async requests. Use 0 to disable it.
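
As an illustration of the tuning step described above, here is a minimal sketch of a helper that writes the two tunables. It assumes a kernel carrying this RFC patch (the deadline_sync and deadline_async files do not exist otherwise) and assumes the values are written as plain millisecond integers, matching the defaults quoted above. The function name set_cfq_deadline and the sysfs_root parameter are hypothetical, added only so the helper can be tried against a scratch directory.

```python
from pathlib import Path


def set_cfq_deadline(device, sync_ms=None, async_ms=None, sysfs_root="/sys"):
    """Write the deadline tunables named in the patch description.

    Assumption: values are millisecond integers and 0 disables the check.
    sysfs_root is a hypothetical parameter for dry runs in a scratch tree.
    """
    iosched = Path(sysfs_root) / "block" / device / "queue" / "iosched"
    written = {}
    for name, value in (("deadline_sync", sync_ms), ("deadline_async", async_ms)):
        if value is None:
            continue  # leave this tunable untouched
        if value < 0:
            raise ValueError(f"{name} must be >= 0 (0 disables the deadline)")
        (iosched / name).write_text(f"{value}\n")
        written[name] = value
    return written
```

For example, set_cfq_deadline("sda", sync_ms=1000) would tighten only the sync deadline, and async_ms=0 would switch the async check off; on a real system this needs root privileges.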
>
> What's the workload where you are running into issues with existing
> policy?

This is a huge internal test workload:
there are more than 100 containers running mail/http/ftp and other services.

>
> We have low_latency=1 by default, which tries to schedule every
> queue at least once in 300ms. And within a queue we already have the
> notion of looking at the fifo and dispatching the expired request first.

Without this patch, some requests get stuck in the scheduler for more than 30 seconds,
and there appears to be no upper limit.

With this patch, max-wait-time (from the second patch) shows 7 seconds for this workload,
so the queue is of course over-congested, but it continues to work predictably.
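
To make the "safety valve" behaviour concrete, here is a small user-space model of the idea, not the patch itself: requests are kept in arrival order per class, and an expired request overrides whatever the fair policy would have picked. The class and method names (Request, DeadlineValve, fair_choice) are hypothetical; only the default deadlines and the "0 disables" rule come from the patch description.

```python
from collections import deque
from dataclasses import dataclass

# Defaults quoted in the patch description (milliseconds); 0 disables the check.
DEADLINE_SYNC_MS = 2000
DEADLINE_ASYNC_MS = 4000


@dataclass
class Request:
    name: str
    sync: bool
    queued_at_ms: int


class DeadlineValve:
    """Hypothetical model: a FIFO consulted before the fair policy's pick."""

    def __init__(self, sync_ms=DEADLINE_SYNC_MS, async_ms=DEADLINE_ASYNC_MS):
        self.deadline_ms = {True: sync_ms, False: async_ms}
        self.fifo = {True: deque(), False: deque()}  # arrival order per class

    def add(self, req):
        self.fifo[req.sync].append(req)

    def expired(self, now_ms):
        """Return an expired request (sync class checked first), or None."""
        for sync in (True, False):
            limit = self.deadline_ms[sync]
            queue = self.fifo[sync]
            if limit and queue and now_ms - queue[0].queued_at_ms > limit:
                return queue[0]
        return None

    def dispatch(self, now_ms, fair_choice):
        """Dispatch an expired request if there is one, else fair_choice."""
        req = self.expired(now_ms) or fair_choice
        if req is None:
            return None
        self.fifo[req.sync].remove(req)
        return req
```

In this model the override fires only when the head of a FIFO has waited longer than its deadline, so under normal load the fair policy's choice is dispatched unchanged.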

>
> So to me sync queue scheduling should be pretty good. Async queues
> can get starved though. Within a sync queue, if some requests have
> expired, it is probably because the disk is slow and
> we are throwing too much IO at it. So if we start always dispatching
> expired requests first, then the notion of fairness is out of the
> window.
>
> Why not use the deadline scheduler for your case?

Because the scheduler must be universal: the load can be arbitrary and constantly changing,
and we cannot tune each machine separately.

>
> Thanks
> Vivek
