linux-kernel - Re: [PATCH 9/9] ext3: do not throttle metadata and journal IO

Message-ID: <20090421181429.GO19637@balbir.in.ibm.com>
Date:	Tue, 21 Apr 2009 23:44:29 +0530
From:	Balbir Singh <balbir@...ux.vnet.ibm.com>
To:	Theodore Tso <tytso@....edu>,
	Andrea Righi <righi.andrea@...il.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	Paul Menage <menage@...gle.com>,
	Gui Jianfeng <guijianfeng@...fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	agk@...rceware.org, akpm@...ux-foundation.org,
	baramsori72@...il.com, Carl Henrik Lunde <chlunde@...g.uio.no>,
	dave@...ux.vnet.ibm.com, Divyesh Shah <dpshah@...gle.com>,
	eric.rannaud@...il.com, fernando@....ntt.co.jp,
	Hirokazu Takahashi <taka@...inux.co.jp>,
	Li Zefan <lizf@...fujitsu.com>, matt@...ehost.com,
	dradford@...ehost.com, ngupta@...gle.com, randy.dunlap@...cle.com,
	roberto@...it.it, Ryo Tsuruta <ryov@...inux.co.jp>,
	Satoshi UCHIDA <s-uchida@...jp.nec.com>,
	subrata@...ux.vnet.ibm.com, yoshikawa.takuya@....ntt.co.jp,
	containers@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 9/9] ext3: do not throttle metadata and journal IO

* Theodore Tso <tytso@....edu> [2009-04-21 13:46:20]:

> On Tue, Apr 21, 2009 at 10:53:17PM +0530, Balbir Singh wrote:
> > Coming to the dirty page tracking issue, the issue being raised is
> > the same one that we have with shared page accounting. I
> > am working on estimates for shared page accounting and it should be
> > possible to extend it to dirty shared page accounting. Using the
> > shared ratios for decisions might be a better strategy.
> 
> It's the same issue, but again, consider the use case where the
> readers and the writers are in different cgroups.  This can happen
> quite often in database workloads, where you might have many readers,
> and a single process doing the database update.  Or the case where you
> have one process in one cgroup doing a tail -f of some log file, and
> another process writing to the log file.
> 

That would be true in general, but only the process writing to the
file will dirty it. So dirty already accounts for the read/write
split. I'd assume that the cost is only for the dirty page, since we
do IO only on write in this case, unless I am missing something very
obvious.
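
To make the comparison concrete, here is a toy sketch (plain Python, not
kernel code) of the three charging policies under discussion: charge the
first toucher, charge the cgroup that dirtied the page, or split the
charge by a shared ratio. The page records, the cgroup names "readers"
and "writer", and the function names are all hypothetical, and the
equal split across mappers is only one possible reading of "shared
ratio".

```python
from collections import Counter

def charge_first_toucher(pages):
    """Charge each dirty page entirely to the cgroup that first touched it."""
    charges = Counter()
    for page in pages:
        if page["dirty"]:
            charges[page["first_toucher"]] += 1.0
    return dict(charges)

def charge_dirtier(pages):
    """Charge each dirty page entirely to the cgroup that dirtied it."""
    charges = Counter()
    for page in pages:
        if page["dirty"]:
            charges[page["dirtied_by"]] += 1.0
    return dict(charges)

def charge_shared_ratio(pages):
    """Split each dirty page equally among the cgroups sharing it (assumed ratio)."""
    charges = Counter()
    for page in pages:
        if page["dirty"]:
            share = 1.0 / len(page["mappers"])
            for cgroup in page["mappers"]:
                charges[cgroup] += share
    return dict(charges)

# Hypothetical log-file scenario: the reader cgroup touched the pages
# first, and a separate writer cgroup dirtied all four of them.
log_pages = [
    {
        "dirty": True,
        "first_toucher": "readers",
        "dirtied_by": "writer",
        "mappers": ["readers", "writer"],
    }
    for _ in range(4)
]

print(charge_first_toucher(log_pages))
print(charge_dirtier(log_pages))
print(charge_shared_ratio(log_pages))
```

With these four pages, first-toucher charging bills all four dirty
pages to "readers", charging the dirtier bills all four to "writer",
and the equal split bills two pages' worth to each.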

> Using a shared ratio is certainly better than charging 100% of the
> write to whichever unfortunate process happened to first read the
> page, but it will still not be terribly accurate.  A lot really
> depends on how you expect these cgroup limits will be used, and what
> the requirements actually will be with respect to accuracy.  If the
> requirements for accuracy are different for RSS tracking and dirty
> page tracking --- which could easily be the case, since memory is
> usually much cheaper than I/O bandwidth, and there are generally far
> more clean memory pages than there are dirty memory pages, so a small
> numerical error in dirty page accounting translates to a much larger
> percentage error than read-only RSS page accounting --- it may make
> sense to use different mechanisms for tracking the two, given the
> different requirements and differing overhead implications.
>
> Anyway, something for you to think about.

Yep, but I would recommend using the controller we have; if the
overheads pan out to be too large for IO, we can think about
alternatives.

-- 
	Balbir