linux-kernel - Re: RFC: I/O bandwidth controller (was Re: Too many I/O controller patches)


Message-ID: <48A0A689.40908@gmail.com>
Date:	Mon, 11 Aug 2008 22:52:25 +0200
From:	Andrea Righi <righi.andrea@...il.com>
To:	Fernando Luis Vázquez Cao 
	<fernando@....ntt.co.jp>
CC:	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Ryo Tsuruta <ryov@...inux.co.jp>,
	yoshikawa.takuya@....ntt.co.jp, taka@...inux.co.jp,
	uchida@...jp.nec.com, ngupta@...gle.com,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com,
	containers@...ts.linux-foundation.org,
	virtualization@...ts.linux-foundation.org,
	xen-devel@...ts.xensource.com, agk@...rceware.org
Subject: Re: RFC: I/O bandwidth controller (was Re: Too many I/O controller
 patches)

Fernando Luis Vázquez Cao wrote:
>>> This seems to be the easiest part, but the current cgroups
>>> infrastructure has some limitations when it comes to dealing with block
>>> devices: impossibility of creating/removing certain control structures
>>> dynamically and hardcoding of subsystems (i.e. resource controllers).
>>> This makes it difficult to handle block devices that can be hotplugged
>>> and go away at any time (this applies not only to usb storage but also
>>> to some SATA and SCSI devices). To cope with this situation properly we
>>> would need hotplug support in cgroups, but, as suggested before and
>>> discussed in the past (see (0) below), there are some limitations.
>>>
>>> Even in the non-hotplug case it would be nice if we could treat each
>>> block I/O device as an independent resource, which means we could do
>>> things like allocating I/O bandwidth on a per-device basis. As long as
>>> performance is not compromised too much, adding some kind of basic
>>> hotplug support to cgroups is probably worth it.
>>>
>>> (0) http://lkml.org/lkml/2008/5/21/12
>> What about using major,minor numbers to identify each device and account
>> IO statistics? If a device is unplugged we could reset IO statistics
>> and/or remove IO limitations for that device from userspace (i.e. by a
>> daemon), but plugging/unplugging the device would not be blocked/affected
>> in any case. Or am I oversimplifying the problem?
> If a resource we want to control (a block device in this case) is
> hot-plugged/unplugged the corresponding cgroup-related structures inside
> the kernel need to be allocated/freed dynamically, respectively. The
> problem is that this is not always possible. For example, with the
> current implementation of cgroups it is not possible to treat each block
> device as a different cgroup subsystem/resource controller, because
> subsystems are created at compile time.

The subsystem as a whole is created at compile time, but the controller
data structures are allocated dynamically (see, e.g., struct mem_cgroup
for the memory controller). So identifying each device by a name, or by
a key such as major,minor, instead of by a reference/pointer to a struct
could help to handle this from userspace. I mean, if a device is
unplugged, a userspace daemon can just handle the event and delete the
controller data structures allocated for that device, asynchronously,
via a userspace->kernel interface, without the kernel holding a
reference to that particular block device. Anyway, implementing a
generic interface that allows defining hooks for hot-pluggable devices
(or similar events) in cgroups would be interesting.
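
To make the major,minor idea a bit more concrete, here is a rough
userspace-style sketch in Python (not kernel code). The class
PerDeviceLimits and its methods are hypothetical names used only for
illustration; the only real API used is os.makedev/os.major/os.minor.
The point is that state is keyed by device number, so nothing holds a
reference to the device itself and an unplug event only has to drop a
table entry.

```python
import os


class PerDeviceLimits:
    """Hypothetical per-cgroup table of I/O state keyed by (major, minor)."""

    def __init__(self):
        self._limits = {}      # (major, minor) -> bandwidth limit in bytes/s
        self._accounted = {}   # (major, minor) -> bytes accounted so far

    @staticmethod
    def _key(dev):
        # Split a dev_t-style number into the (major, minor) pair used as key.
        return (os.major(dev), os.minor(dev))

    def set_limit(self, dev, bytes_per_sec):
        key = self._key(dev)
        self._limits[key] = bytes_per_sec
        self._accounted.setdefault(key, 0)

    def account(self, dev, nbytes):
        key = self._key(dev)
        self._accounted[key] = self._accounted.get(key, 0) + nbytes

    def limit(self, dev):
        return self._limits.get(self._key(dev))

    def accounted(self, dev):
        return self._accounted.get(self._key(dev), 0)

    def on_unplug(self, dev):
        # Called by the (assumed) userspace daemon when the device goes away.
        # Dropping the entries is enough: no device reference was ever held.
        key = self._key(dev)
        self._limits.pop(key, None)
        self._accounted.pop(key, None)


if __name__ == "__main__":
    table = PerDeviceLimits()
    sda = os.makedev(8, 0)             # 8:0 is conventionally /dev/sda
    table.set_limit(sda, 10 * 1024 * 1024)
    table.account(sda, 4096)
    table.on_unplug(sda)               # state for 8:0 is gone afterwards
```

Removing state for a device that was never registered is a no-op here,
so the daemon does not need to know whether a limit was ever set for
the device that just disappeared.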

>>> 3. & 4. & 5. - I/O bandwidth shaping & General design aspects
>>>
>>> The implementation of an I/O scheduling algorithm is to a certain extent
>>> influenced by what we are trying to achieve in terms of I/O bandwidth
>>> shaping, but, as discussed below, the required accuracy can determine
>>> the layer where the I/O controller has to reside. Off the top of my
>>> head, there are three basic operations we may want to perform:
>>>   - I/O nice prioritization: ionice-like approach.
>>>   - Proportional bandwidth scheduling: each process/group of processes
>>> has a weight that determines the share of bandwidth they receive.
>>>   - I/O limiting: set an upper limit to the bandwidth a group of tasks
>>> can use.
>> Using deadline-based IO scheduling could be an interesting path to be
>> explored as well, IMHO, to try to guarantee per-cgroup minimum bandwidth
>> requirements.
> Please note that the only thing we can do is to guarantee minimum
> bandwidth requirement when there is contention for an IO resource, which
> is precisely what a proportional bandwidth scheduler does. Am I missing
> something?

Correct. Proportional bandwidth scheduling automatically guarantees
minimum requirements under contention (unlike the IO limiting approach,
which needs additional mechanisms to achieve this).
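
As a back-of-the-envelope illustration of why weights imply a minimum
under contention, here is a small Python sketch. It is only a model of
proportional sharing (a simple water-filling loop), not any of the
proposed controllers; the function names, group names and numbers are
made up for the example, and weights are assumed to be positive.

```python
def guaranteed_minimum(weights, group, capacity):
    """Share a group gets when every group is saturating the device."""
    return capacity * weights[group] / sum(weights.values())


def proportional_shares(weights, demands, capacity):
    """Split capacity by weight, never giving a group more than it asks for.

    Bandwidth left unused by a group with low demand is redistributed
    to the remaining groups, again in proportion to their weights.
    """
    alloc = {g: 0.0 for g in weights}
    active = {g for g in weights if demands[g] > 0}
    remaining = float(capacity)
    while active and remaining > 0:
        total_w = sum(weights[g] for g in active)
        # Groups whose demand fits within their weighted share.
        satisfied = {g for g in active
                     if demands[g] <= remaining * weights[g] / total_w}
        if not satisfied:
            # Everyone is contending: plain weighted split of what is left.
            for g in active:
                alloc[g] = remaining * weights[g] / total_w
            break
        for g in satisfied:
            alloc[g] = float(demands[g])
            remaining -= demands[g]
        active -= satisfied
    return alloc


if __name__ == "__main__":
    weights = {"a": 3, "b": 1}
    # Both groups saturate a device that delivers 40 units of bandwidth.
    print(proportional_shares(weights, {"a": 100, "b": 100}, 40))
    # Group "a" is mostly idle, so "b" can use what "a" leaves behind.
    print(proportional_shares(weights, {"a": 5, "b": 100}, 40))
```

With weights 3:1 and both groups saturating the device, "b" still gets
a quarter of whatever the device delivers; a pure upper limit on "a"
would not give "b" that by itself. Note that the guarantee is relative
to the delivered capacity, not an absolute rate.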

In any case there's no guarantee that a cgroup/application can sustain,
e.g., 10MB/s on a certain device, but this is a hard problem anyway, and
the best we can do is to try to satisfy "soft" constraints.

-Andrea