linux-ext4 - Re: [PATCH 2/2] improve ext3 fsync batching

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20080819133423.7b8175d6.akpm@linux-foundation.org>
Date:	Tue, 19 Aug 2008 13:34:23 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	rwheeler@...hat.com
Cc:	adilger@....com, jbacik@...hat.com, linux-kernel@...r.kernel.org,
	tglx@...utronix.de, linux-fsdevel@...r.kernel.org,
	chris.mason@...cle.com, linux-ext4@...r.kernel.org
Subject: Re: [PATCH 2/2] improve ext3 fsync batching

On Tue, 19 Aug 2008 14:43:27 -0400
Ric Wheeler <rwheeler@...hat.com> wrote:

> Andrew Morton wrote:
> > On Tue, 19 Aug 2008 07:01:11 -0400 Ric Wheeler <rwheeler@...hat.com> wrote:
> >
> >   
> >> It would be great to be able to use this batching technique for faster 
> >> devices, but we currently sleep 3-4 times longer waiting to batch for an 
> >> array than it takes to complete the transaction.
> >>     
> >
> > Obviously, tuning that delay down to the minimum necessary is a good
> > thing.  But doing it based on commit-time seems indirect at best.  What
> > happens on a slower disk when commit times are in the tens of
> > milliseconds?  When someone runs a concurrent `dd if=/dev/zero of=foo'
> > when commit times go up to seconds?
> >
> > Perhaps a better scheme would be to tune it based on how many other
> > processes are joining that transaction.  If it's "zero" then decrease
> > the timeout.  But one would need to work out how to increase it, which
> > perhaps could be done by detecting the case where process A runs an
> > fsync when a commit is currently in progress, and that commit was
> > caused by process B's fsync.
> >
> > But before doing all that I would recommend/ask that the following be
> > investigated:
> >
> > - How effective is the present code?
> >
> >   - What happens when it is simply removed?
> >
> >   - Add instrumentation (a counter and a printk) to work out how
> >     many other tasks are joining this task's transaction.
> >
> >     - If the answer is "zero" or "small", work out why.
> >
> >   - See if we can increase its effectiveness.
> >
> > Because it could be that the code broke.  There might be issues with
> > higher-level locks which are preventing the batching.  For example, if
> > all the files which the test app is syncing are in the same directory,
> > perhaps all the tasks are piling up on that directory's i_mutex?
> >   
> 
> One other way to think about this is as a fairly normal queuing problem:
> 
>     (1) arrival rate is the rate at which we see new tasks coming into 
> the code
>     (2) service time is basically the time spent committing the 
> transaction to storage
> 
> and we have the assumption that some number of tasks can join a 
> transaction more or less for "free."
> 
> What the existing code assumes is that all devices have an equal service 
> time. That worked well as long as we only looked at devices that were 
> roughly equal (10-20 ms latencies) or used a higher HZ for the kernel 
> (1000HZ and you don't see this as much as with 100HZ).
> 
> The two key issues that Josef's code tried to address are that first 
> assumption that all devices have a similar service time and the tie 
> between how long we wait and the HZ.

yes, I see the (indirect) logic.  But I wonder about whether there's a
direct way of measuring the thing we really want to measure.

Also, I'd be heaps less scared of the change if it did

	commit_time = min(commit_time, one jiffy)


> It would seem to be generically 
> useful to be able to sleep for less than 1 jiffie, not just for file 
> systems, but maybe also in some other contexts?

Sure.
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html