linux-kernel - Re: [RFC] writeback and cgroup


Date:	Wed, 4 Apr 2012 16:32:39 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Fengguang Wu <fengguang.wu@...el.com>, Jan Kara <jack@...e.cz>,
	Jens Axboe <axboe@...nel.dk>, linux-mm@...ck.org,
	sjayaraman@...e.com, andrea@...terlinux.com, jmoyer@...hat.com,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	kamezawa.hiroyu@...fujitsu.com, lizefan@...wei.com,
	containers@...ts.linux-foundation.org, cgroups@...r.kernel.org,
	ctalbott@...gle.com, rni@...gle.com, lsf@...ts.linux-foundation.org
Subject: Re: [RFC] writeback and cgroup

On Wed, Apr 04, 2012 at 11:49:09AM -0700, Tejun Heo wrote:

[..]

> Thirdly, I don't see how writeback can control all the IOs.  I mean,
> what about reads or direct IOs?  It's not like IO devices have
> separate channels for those different types of IOs.  They interact
> heavily.

> Let's say we have iops/bps limitation applied on top of proportional IO
> distribution

We already do that. IO is first subjected to the throttling limit, and only
then is it passed to the elevator for proportional IO. So throttling is
already stacked on top of proportional IO. The only question is whether it
should be pushed to even higher layers.
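
A rough sketch of that ordering, as a toy model and not kernel code. All
names here (ThrottleStage, ProportionalElevator, run_tick) are made up for
illustration, the "tick" budget is an assumed simplification of a bps limit,
and the weighted-service bookkeeping is only one possible way to do
proportional dispatch:

```python
from collections import deque


class ThrottleStage:
    """Toy stand-in for the throttling layer: a per-group bytes-per-tick cap."""

    def __init__(self, limits):
        # group -> bytes allowed per tick; None means unlimited (assumption)
        self.limits = limits
        self.pending = {group: deque() for group in limits}

    def submit(self, group, size):
        self.pending[group].append(size)

    def tick(self):
        """Release, per group, the queued IOs that fit in this tick's budget."""
        released = []
        for group, queue in self.pending.items():
            budget = self.limits[group]
            while queue and (budget is None or queue[0] <= budget):
                size = queue.popleft()
                if budget is not None:
                    budget -= size
                released.append((group, size))
        return released


class ProportionalElevator:
    """Toy stand-in for the elevator: serve the group with least weighted service."""

    def __init__(self, weights):
        self.weights = weights
        self.queues = {group: deque() for group in weights}
        self.vtime = {group: 0.0 for group in weights}

    def add(self, group, size):
        self.queues[group].append(size)

    def dispatch(self):
        busy = [g for g, q in self.queues.items() if q]
        if not busy:
            return None
        group = min(busy, key=lambda g: self.vtime[g])
        size = self.queues[group].popleft()
        # Heavier groups accumulate virtual time more slowly, so they get more.
        self.vtime[group] += size / self.weights[group]
        return group, size


def run_tick(throttle, elevator):
    """Throttling runs first; only IO it releases ever reaches the elevator."""
    for group, size in throttle.tick():
        elevator.add(group, size)
    order = []
    while True:
        item = elevator.dispatch()
        if item is None:
            return order
        order.append(item)
```

With group "a" capped and group "b" unlimited, the elevator only ever sees
the part of "a"'s IO that the throttling stage let through, and then shares
the device between the two by weight.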

> or a device holds two partitions and one
> of them is being used for direct IO w/o filesystems.  How would that
> work?  I think the question goes even deeper, what do the separate
> limits even mean?

Separate limits for buffered writes are just filling the gap. Agreed it
is not a very neat solution.

>  Does the IO sched have to calculate allocation of
> IO resource to different types of IOs and then give a "number" to
> writeback which in turn enforces that limit?  How does the elevator
> know what number to give?  Is the number iops or bps or weight?

If we push all the throttling up into some higher layer, say some kind of
per-bdi throttling interface, then the elevator just has to worry about
doing proportional IO. No interaction with higher layers regarding
iops/bps etc. (Not that the elevator worries about it today.)

> If
> the iosched doesn't know how much write workload exists, how does it
> distribute the surplus buffered writeback resource across different
> cgroups?  If so, what makes the limit actually enforceable (due to
> inaccuracies in estimation, fluctuation in workload, delay in
> enforcement in different layers and whatnot) except for block layer
> applying the limit *again* on the resulting stream of combined IOs?

So the split model is definitely confusing. Anyway, the block layer will not
apply the limits again, because flusher IO goes into the root cgroup, which
is generally unthrottled. Or the flusher could mark the bios with a
"do not throttle" flag, since they have been throttled already. So
throttling again is probably not an issue.
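
To illustrate the two ways of avoiding a second round of throttling, here is
a minimal sketch. It is not kernel code: the Bio class, the
already_throttled flag and should_throttle() are hypothetical names, and the
limits table is an assumed stand-in for per-cgroup bps/iops limits:

```python
from dataclasses import dataclass


@dataclass
class Bio:
    cgroup: str
    size: int
    # Hypothetical "do not throttle again" marker set by the flusher.
    already_throttled: bool = False


def should_throttle(bio, limits):
    """Decide whether the block layer would apply a limit to this bio.

    limits maps cgroup name -> limit; a missing or None entry means the
    group is unthrottled (the usual situation for the root cgroup).
    """
    if bio.already_throttled:
        return False
    return limits.get(bio.cgroup) is not None


limits = {"root": None, "grp1": 1024 * 1024}

# Option 1: flusher IO is submitted in the root cgroup, which has no limit.
flusher_in_root = Bio(cgroup="root", size=4096)

# Option 2: flusher IO keeps its cgroup but carries the flag.
flusher_flagged = Bio(cgroup="grp1", size=4096, already_throttled=True)

# A direct IO from the same cgroup is still subject to the limit.
direct_io = Bio(cgroup="grp1", size=4096)
```

In both flusher cases should_throttle() returns False, while the direct IO
from grp1 is still throttled at the block layer.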

In summary, agreed that the split is confusing, but it fills a gap that
exists today.

Thanks
Vivek