linux-kernel - Re: [RFC] IO scheduler based IO controller V9

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4AAA6255.70807@redhat.com>
Date:	Fri, 11 Sep 2009 16:44:37 +0200
From:	Jerome Marchand <jmarchan@...hat.com>
To:	Vivek Goyal <vgoyal@...hat.com>
CC:	linux-kernel@...r.kernel.org, jens.axboe@...cle.com,
	containers@...ts.linux-foundation.org, dm-devel@...hat.com,
	nauman@...gle.com, dpshah@...gle.com, lizf@...fujitsu.com,
	mikew@...gle.com, fchecconi@...il.com, paolo.valente@...more.it,
	ryov@...inux.co.jp, fernando@....ntt.co.jp, s-uchida@...jp.nec.com,
	taka@...inux.co.jp, guijianfeng@...fujitsu.com, jmoyer@...hat.com,
	dhaval@...ux.vnet.ibm.com, balbir@...ux.vnet.ibm.com,
	righi.andrea@...il.com, m-ikeda@...jp.nec.com, agk@...hat.com,
	akpm@...ux-foundation.org, peterz@...radead.org,
	torvalds@...ux-foundation.org, mingo@...e.hu, riel@...hat.com
Subject: Re: [RFC] IO scheduler based IO controller V9

Vivek Goyal wrote:
> On Fri, Sep 11, 2009 at 03:16:23PM +0200, Jerome Marchand wrote:
>> Vivek Goyal wrote:
>>> On Thu, Sep 10, 2009 at 04:52:27PM -0400, Vivek Goyal wrote:
>>>> On Thu, Sep 10, 2009 at 05:18:25PM +0200, Jerome Marchand wrote:
>>>>> Vivek Goyal wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Here is the V9 of the IO controller patches generated on top of 2.6.31-rc7.
>>>>>  
>>>>> Hi Vivek,
>>>>>
>>>>> I've run some postgresql benchmarks for io-controller. Tests have been
>>>>> made with 2.6.31-rc6 kernel, without io-controller patches (when
>>>>> relevant) and with io-controller v8 and v9 patches.
>>>>> I set up two instances of the TPC-H database, each running in their
>>>>> own io-cgroup. I ran two clients to these databases and tested on each
>>>>> that simple request:
>>>>> $ select count(*) from LINEITEM;
>>>>> where LINEITEM is the biggest table of TPC-H (6001215 entries,
>>>>> 720MB). That request generates a steady stream of IOs.
>>>>>
>>>>> Time is measure by psql (\timing switched on). Each test is run twice
>>>>> or more if there is any significant difference between the first two
>>>>> runs. Before each run, the cache is flush:
>>>>> $ echo 3 > /proc/sys/vm/drop_caches
>>>>>
>>>>>
>>>>> Results with 2 groups of same io policy (BE) and same io weight (1000):
>>>>>
>>>>> 	w/o io-scheduler	io-scheduler v8		io-scheduler v9
>>>>> 	first	second		first	second		first	second
>>>>> 	DB	DB		DB	DB		DB	DB
>>>>>
>>>>> CFQ	48.4s	48.4s		48.2s	48.2s		48.1s	48.5s
>>>>> Noop	138.0s	138.0s		48.3s	48.4s		48.5s	48.8s
>>>>> AS	46.3s	47.0s		48.5s	48.7s		48.3s	48.5s
>>>>> Deadl.	137.1s	137.1s		48.2s	48.3s		48.3s	48.5s
>>>>>
>>>>> As you can see, there is no significant difference for CFQ
>>>>> scheduler.
>>>> Thanks Jerome.  
>>>>
>>>>> There is big improvement for noop and deadline schedulers
>>>>> (why is that happening?).
>>>> I think because now related IO is in a single queue and it gets to run
>>>> for 100ms or so (like CFQ). So previously, IO from both the instances
>>>> will go into a single queue which should lead to more seeks as requests
>>>> from two groups will kind of get interleaved.
>>>>
>>>> With io controller, both groups have separate queues so requests from
>>>> both the data based instances will not get interleaved (This almost
>>>> becomes like CFQ where ther are separate queues for each io context
>>>> and for sequential reader, one io context gets to run nicely for certain
>>>> ms based on its priority).
>>>>
>>>>> The performance with anticipatory scheduler
>>>>> is a bit lower (~4%).
>>>>>
>>> Hi Jerome, 
>>>
>>> Can you also run the AS test with io controller patches and both the
>>> database in root group (basically don't put them in to separate group). I 
>>> suspect that this regression might come from that fact that we now have
>>> to switch between queues and in AS we wait for request to finish from
>>> previous queue before next queue is scheduled in and probably that is
>>> slowing down things a bit.., just a wild guess..
>>>
>> Hi Vivek,
>>
>> I guess that's not the reason. I got 46.6s for both DB in root group with
>> io-controller v9 patches. I also rerun the test with DB in different groups
>> and found about the same result as above (48.3s and 48.6s).
>>
> 
> Hi Jerome,
> 
> Ok, so when both the DB's are in root group (with io-controller V9
> patches), then you get 46.6 seconds time for both the DBs. That means there
> is no regression in this case. In this case there is only one queue of 
> root group and AS is running timed read/write batches on this queue.
> 
> But when both the DBs are put in separate groups then you get 48.3 and
> 48.6 seconds respectively and we see regression. In this case there are
> two queues belonging to each group. Elevator layer takes care of queue
> group queue switch and AS runs timed read/write batches on these queues.
> 
> If it is correct, then it does not exclude the possiblity that it is queue
> switching overhead between groups?

Yes it's correct. I misunderstood you.

Jerome

>  
> Thanks
> Vivek

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/