linux-kernel - Re: [RFC] IO scheduler based IO controller V9

Message-ID: <4AAE53AD.1080206@redhat.com>
Date:	Mon, 14 Sep 2009 16:31:09 +0200
From:	Jerome Marchand <jmarchan@...hat.com>
To:	Vivek Goyal <vgoyal@...hat.com>
CC:	linux-kernel@...r.kernel.org, jens.axboe@...cle.com,
	containers@...ts.linux-foundation.org, dm-devel@...hat.com,
	nauman@...gle.com, dpshah@...gle.com, lizf@...fujitsu.com,
	mikew@...gle.com, fchecconi@...il.com, paolo.valente@...more.it,
	ryov@...inux.co.jp, fernando@....ntt.co.jp, s-uchida@...jp.nec.com,
	taka@...inux.co.jp, guijianfeng@...fujitsu.com, jmoyer@...hat.com,
	dhaval@...ux.vnet.ibm.com, balbir@...ux.vnet.ibm.com,
	righi.andrea@...il.com, m-ikeda@...jp.nec.com, agk@...hat.com,
	akpm@...ux-foundation.org, peterz@...radead.org,
	torvalds@...ux-foundation.org, mingo@...e.hu, riel@...hat.com
Subject: Re: [RFC] IO scheduler based IO controller V9

Vivek Goyal wrote:
> On Thu, Sep 10, 2009 at 05:18:25PM +0200, Jerome Marchand wrote:
>> Vivek Goyal wrote:
>>> Hi All,
>>>
>>> Here is the V9 of the IO controller patches generated on top of 2.6.31-rc7.
>>  
>> Hi Vivek,
>>
>> I've run some postgresql benchmarks for io-controller. Tests have been
>> made with 2.6.31-rc6 kernel, without io-controller patches (when
>> relevant) and with io-controller v8 and v9 patches.
>> I set up two instances of the TPC-H database, each running in its
>> own io-cgroup. I ran two clients against these databases and tested
>> this simple request on each:
>> $ select count(*) from LINEITEM;
>> where LINEITEM is the biggest table of TPC-H (6001215 entries,
>> 720MB). That request generates a steady stream of IOs.
>>
>> Time is measured by psql (\timing switched on). Each test is run twice,
>> or more if there is any significant difference between the first two
>> runs. Before each run, the cache is flushed:
>> $ echo 3 > /proc/sys/vm/drop_caches
>>
>>
>> Results with 2 groups of same io policy (BE) and same io weight (1000):
>>
>> 	w/o io-scheduler	io-scheduler v8		io-scheduler v9
>> 	first	second		first	second		first	second
>> 	DB	DB		DB	DB		DB	DB
>>
>> CFQ	48.4s	48.4s		48.2s	48.2s		48.1s	48.5s
>> Noop	138.0s	138.0s		48.3s	48.4s		48.5s	48.8s
>> AS	46.3s	47.0s		48.5s	48.7s		48.3s	48.5s
>> Deadl.	137.1s	137.1s		48.2s	48.3s		48.3s	48.5s
>>
>> As you can see, there is no significant difference for the CFQ
>> scheduler. There is a big improvement for the noop and deadline
>> schedulers (why is that happening?). The performance with the
>> anticipatory scheduler is a bit lower (~4%).
>>
> 
> Ok, I think what's happening here is that by default the slice length
> for a queue is 100ms. When you put two instances of DB in two different
> groups, one streaming reader can run for at most 100ms at a go and then
> we switch to the next reader.
> 
> But when both readers are in the root group, AS lets one reader run
> for at most 250ms (sometimes 125ms and sometimes 250ms, depending on
> when as_fifo_expired() was invoked).
> 
> So because a reader gets to run longer at one stretch in the root group,
> it reduces the number of seeks and leads to slightly better throughput.
> 
> If you change the /sys/block/<disk>/queue/iosched/slice_sync to 250 ms, then
> one group queue can run at max for 250ms before we switch the queue. In
> this case you should be able to get same performance as in root group.
> 
> Thanks
> Vivek

Indeed. When I run the benchmark with slice_sync = 250ms, I get results
close to those for both instances running within the root group:
46.1s for the first group and 46.4s for the second.
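
For anyone wanting to repeat the run, here is a minimal sketch of how the
tunable could be set. The sysfs path is the one Vivek gave above; the
helper name set_slice_sync, the SYSFS_ROOT override and the device name
in the usage comment are only illustrative assumptions, not part of the
patches.

```shell
# Hypothetical helper: set the sync slice length (in ms) for one disk.
# SYSFS_ROOT defaults to /sys; overriding it is only meant for dry runs.
set_slice_sync() {
    disk=$1
    ms=$2
    knob="${SYSFS_ROOT:-/sys}/block/$disk/queue/iosched/slice_sync"

    # The file only exists when the active scheduler exposes this tunable,
    # and writing to it normally requires root.
    if [ ! -w "$knob" ]; then
        echo "cannot write $knob (wrong scheduler, or not root?)" >&2
        return 1
    fi

    echo "$ms" > "$knob"
    # Read the value back so the caller can confirm what is now in effect.
    cat "$knob"
}

# Example, as root (the device name is an assumption):
#   set_slice_sync sdb 250
```

The setting should be applied to the disk holding both databases before
the caches are dropped and the queries are started.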

Jerome

