linux-kernel - Re: RFC: I/O bandwidth controller

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2891419e0808130047o702c9174tf36a4316ed247497@mail.gmail.com>
Date:	Wed, 13 Aug 2008 16:47:25 +0900
From:	"Dong-Jae Kang" <baramsori72@...il.com>
To:	righi.andrea@...il.com
Cc:	"Fernando Luis Vázquez Cao" 
	<fernando@....ntt.co.jp>,
	"Hirokazu Takahashi" <taka@...inux.co.jp>,
	balbir@...ux.vnet.ibm.com, xen-devel@...ts.xensource.com,
	"Satoshi UCHIDA" <s-uchida@...jp.nec.com>,
	containers@...ts.linux-foundation.org,
	linux-kernel@...r.kernel.org,
	virtualization@...ts.linux-foundation.org, dm-devel@...hat.com,
	agk@...rceware.org, dave@...ux.vnet.ibm.com, ngupta@...gle.com
Subject: Re: RFC: I/O bandwidth controller

Hi,

2008/8/13 Andrea Righi <righi.andrea@...il.com>:
> Fernando Luis Vázquez Cao wrote:
>> On Tue, 2008-08-12 at 22:29 +0900, Andrea Righi wrote:
>>> Andrea Righi wrote:
>>>> Hirokazu Takahashi wrote:
>>>>>>>>>> 3. & 4. & 5. - I/O bandwidth shaping & General design aspects
>>>>>>>>>>
>>>>>>>>>> The implementation of an I/O scheduling algorithm is to a certain extent
>>>>>>>>>> influenced by what we are trying to achieve in terms of I/O bandwidth
>>>>>>>>>> shaping, but, as discussed below, the required accuracy can determine
>>>>>>>>>> the layer where the I/O controller has to reside. Off the top of my
>>>>>>>>>> head, there are three basic operations we may want perform:
>>>>>>>>>>   - I/O nice prioritization: ionice-like approach.
>>>>>>>>>>   - Proportional bandwidth scheduling: each process/group of processes
>>>>>>>>>> has a weight that determines the share of bandwidth they receive.
>>>>>>>>>>   - I/O limiting: set an upper limit to the bandwidth a group of tasks
>>>>>>>>>> can use.
>>>>>>>>> Use a deadline-based IO scheduling could be an interesting path to be
>>>>>>>>> explored as well, IMHO, to try to guarantee per-cgroup minimum bandwidth
>>>>>>>>> requirements.
>>>>>>>> Please note that the only thing we can do is to guarantee minimum
>>>>>>>> bandwidth requirement when there is contention for an IO resource, which
>>>>>>>> is precisely what a proportional bandwidth scheduler does. An I missing
>>>>>>>> something?
>>>>>>> Correct. Proportional bandwidth automatically allows to guarantee min
>>>>>>> requirements (instead of IO limiting approach, that needs additional
>>>>>>> mechanisms to achive this).
>>>>>>>
>>>>>>> In any case there's no guarantee for a cgroup/application to sustain
>>>>>>> i.e. 10MB/s on a certain device, but this is a hard problem anyway, and
>>>>>>> the best we can do is to try to satisfy "soft" constraints.
>>>>>> I think guaranteeing the minimum I/O bandwidth is very important. In the
>>>>>> business site, especially in streaming service system, administrator requires
>>>>>> the functionality to satisfy QoS or performance of their service.
>>>>>> Of course, IO throttling is important, but, personally, I think guaranteeing
>>>>>> the minimum bandwidth is more important than limitation of maximum bandwidth
>>>>>> to satisfy the requirement in real business sites.
>>>>>> And I know Andrea's io-throttle patch supports the latter case well and it is
>>>>>> very stable.
>>>>>> But, the first case(guarantee the minimum bandwidth) is not supported in any
>>>>>> patches.
>>>>>> Is there any plans to support it? and Is there any problems in implementing it?
>>>>>> I think if IO controller can support guaranteeing the minimum bandwidth and
>>>>>> work-conserving mode simultaneously, it more easily satisfies the requirement
>>>>>> of the business sites.
>>>>>> Additionally, I didn't understand "Proportional bandwidth automatically allows
>>>>>> to guarantee min
>>>>>> requirements" and "soft constraints".
>>>>>> Can you give me a advice about this ?
>>>>>> Thanks in advance.
>>>>>>
>>>>>> Dong-Jae Kang
>>>>> I think this is what dm-ioband does.
>>>>>
>>>>> Let's say you make two groups share the same disk, and give them
>>>>> 70% of the bandwidth the disk physically has and 30% respectively.
>>>>> This means the former group is almost guaranteed to be able to use
>>>>> 70% of the bandwidth even when the latter one is issuing quite
>>>>> a lot of I/O requests.
>>>>>
>>>>> Yes, I know there exist head seek lags with traditional magnetic disks,
>>>>> so it's important to improve the algorithm to reduce this overhead.
>>>>>
>>>>> And I think it is also possible to add a new scheduling policy to
>>>>> guarantee the minimum bandwidth. It might be cool if some group can
>>>>> use guranteed bandwidths and the other share the rest on proportional
>>>>> bandwidth policy.
>>>>>
>>>>> Thanks,
>>>>> Hirokazu Takahashi.
>>>> With IO limiting approach minimum requirements are supposed to be
>>>> guaranteed if the user configures a generic block device so that the sum
>>>> of the limits doesn't exceed the total IO bandwidth of that device. But,
>>>> in principle, there's nothing in "throttling" that guarantees "fairness"
>>>> among different cgroups doing IO on the same block devices, that means
>>>> there's nothing to guarantee minimum requirements (and this is the
>>>> reason because I liked the Satoshi's CFQ-cgroup approach together with
>>>> io-throttle).
>>>>
>>>> A more complicated issue is how to evaluate the total IO bandwidth of a
>>>> generic device. We can use some kind of averaging/prediction, but
>>>> basically it would be inaccurate due to the mechanic of disks (head
>>>> seeks, but also caching, buffering mechanisms implemented directly into
>>>> the device, etc.). It's a hard problem. And the same problem exists also
>>>> for proportional bandwidth as well, in terms of IO rate predictability I
>>>> mean.
>>> BTW as I said in a previous email, an interesting path to be explored
>>> IMHO could be to think in terms of IO time. So, look at the time an IO
>>> request is issued to the drive, look at the time the request is served,
>>> evaluate the difference and charge the consumed IO time to the
>>> appropriate cgroup. Then dispatch IO requests in function of the
>>> consumed IO time debts / credits, using for example a token-bucket
>>> strategy. And probably the best place to implement the IO time
>>> accounting is the elevator.
>> Please note that the seek time for a specific IO request is strongly
>> correlated with the IO requests that preceded it, which means that the
>> owner of that request is not the only one to blame if it takes too long
>> to process it. In other words, with the algorithm you propose we may end
>> up charging the wrong guy.
>
> mmh.. yes. The only scenario I can imagine where this solution is not
> fair is when there're a lot of guys always requesting the same near
> blocks and a single guy looking for a single distant block (supposing
> disk seeks are more expensive than read/write ops).
>
> In this case it would be fair to charge a huge amount only to the guy
> requesting the single distant block and distribute the cost of the seek
> to move back the head equally among the other guys. Using the algorighm
> I proposed, instead, both the single "bad" guy and the first "good" guy
> that moves back the disk head would spend a large sum of IO credits.
>

I have a question about your description.
In I/O controlling, how do you think about the meaning of "fair" among cgroups ?
These days I was confused about it.
IMHO, if they have a same access time and same access opportunity for
disk I/O regardless of their I/O style(sequential / random / mixed /
…), I think it is fare.
Of course, in this fair situation, the cgroups with same priority or
weight can have a different I/O bandwidth. but, I think it will be in
reasonable range.
So, if other cgroups with fast I/O was sacrificed for the cgroup with
too late I/O to equaliz the I/O quantity, it can be considered
"unfair" for the cgroup with fast I/O
Do I have something wrong about the "fair" concept?
This is just my opinion :)
I welcome and appreciate for other opinions and comments about this

PS)
Andrea, this question is not related to the io-controller
But, I just wonder your another project, network io-throttle, is going
on now? My colleague has researched the similar project and he is try
to implement another one. And i am also interested in net
io-controller. Thank you

Dong-Jae Kang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/