linux-kernel - Re: [PATCH 8/8] vm: Add an tuning knob for vm.max_writeback


Message-Id: <1251876776.7547.52.camel@twins>
Date:	Wed, 02 Sep 2009 09:32:56 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Theodore Tso <tytso@....edu>
Cc:	Christoph Hellwig <hch@...radead.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	chris.mason@...cle.com, david@...morbit.com,
	akpm@...ux-foundation.org, jack@...e.cz
Subject: Re: [PATCH 8/8] vm: Add an tuning knob for vm.max_writeback_pages

On Tue, 2009-09-01 at 16:27 -0400, Theodore Tso wrote:
> On Tue, Sep 01, 2009 at 02:44:55PM -0400, Christoph Hellwig wrote:
> > On Tue, Sep 01, 2009 at 08:38:55PM +0200, Peter Zijlstra wrote:
> > > Do we really need a tunable for this?
> > 
> > It will make increasing it in the field a lot easier.  And having dealt
> > with really large systems, I fear that there are I/O topologies
> > out there for which every "reasonable" value is too low.
> > 
> > > I guess we need a limit to avoid it writing out everything, but can't we
> > > have something automagic?
> > 
> > Some automatic adjustment would be nice.  But finding the right auto
> > tuning will be an interesting exercise.
> 
> The fact that the limit is on a per-inode basis is part of the problem.

I would think that it should be a BDI-based property, since it basically
depends on the speed of the backing dev you're writing to.

> Right now, we are only writing out X pages per inode, so depending on
> whether we have one really gargantuan inode that needs writeout, or ten
> big inodes which are dirty, or a million small inodes, the fact that we
> are imposing a limit based on the number of pages in a single inode that
> we will write out seems like the wrong design choice.

Agreed. The number of chunks, where a chunk is some optimum write size
for the device in question, and the number of seeks seem like more
suitable criteria.

Basically, limit the time spent on writeout and not much else.
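
As a rough illustration of a time-based limit, the sketch below derives
a per-device chunk budget from a time budget. It is not kernel code:
the BackingDev fields, the one-seek-per-chunk default, and the 4096-byte
page size are all assumptions made for the example.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # bytes; assumed page size for the example


@dataclass
class BackingDev:
    """Hypothetical device description; these are not kernel field names."""
    name: str
    bandwidth_bps: float  # sustained write bandwidth in bytes per second
    seek_time_s: float    # average cost of one seek in seconds
    chunk_bytes: int      # assumed optimum write size for this device


def chunk_budget(dev, time_budget_s, seeks_per_chunk=1.0):
    """Number of chunks that fit in time_budget_s (always at least one)."""
    transfer_s = dev.chunk_bytes / dev.bandwidth_bps
    time_per_chunk_s = transfer_s + seeks_per_chunk * dev.seek_time_s
    return max(1, int(time_budget_s / time_per_chunk_s))


def page_budget(dev, time_budget_s, seeks_per_chunk=1.0):
    """The same budget expressed in pages instead of chunks."""
    chunks = chunk_budget(dev, time_budget_s, seeks_per_chunk)
    return chunks * dev.chunk_bytes // PAGE_SIZE
```

With this shape the only fixed input is the time budget; the page count
falls out of each device's speed and seek cost rather than being set by
hand.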

> So perhaps the best argument for not making this be a tunable is that
> in the long run, we will need to put in a better algorithm for
> controlling how much writeback we want to do before we start
> saturating RAID arrays, and in that new algorithm this tunable may no
> longer make sense.  Fine; at that point, we can make it go away.  For
> now, though, it seems to be the best way to tweak what is going on,
> since I doubt we'll be able to come up with one magic number that will
> satisfy everyone.

Thing is, will this single tunable be sufficient for people who have
both a RAID array and a USB stick on the same machine?
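
To put numbers on that concern, here is a small sketch of how long one
global page limit would take to write out on two very different
devices. The bandwidth figures, the 32768-page limit, and the 4096-byte
page size are made-up values for illustration, not measurements.

```python
PAGE_SIZE = 4096  # bytes; assumed page size for the example
MIB = 2**20


def writeout_seconds(pages, bandwidth_bps):
    """Time to write `pages` pages at a sustained bandwidth, ignoring seeks."""
    return pages * PAGE_SIZE / bandwidth_bps


# Hypothetical single limit shared by every device: 32768 pages = 128 MiB.
LIMIT_PAGES = 32768

# Assumed sustained write bandwidths for the two devices.
raid_s = writeout_seconds(LIMIT_PAGES, 512 * MIB)  # fast RAID array
usb_s = writeout_seconds(LIMIT_PAGES, 8 * MIB)     # slow USB stick

print(f"RAID: {raid_s:.2f} s, USB: {usb_s:.2f} s")
```

Under these assumptions the same limit is a quarter of a second of work
on the array and sixteen seconds on the stick, so a value that suits
one device is far off for the other.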