Message-Id: <1238260946.29177.11.camel@surfer>
Date: Sat, 28 Mar 2009 12:22:26 -0500
From: David Hagood <david.hagood@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Stefan Richter <stefanr@...6.in-berlin.de>,
Mark Lord <lkml@....ca>, Jeff Garzik <jeff@...zik.org>,
Matthew Garrett <mjg59@...f.ucam.org>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Theodore Tso <tytso@....edu>,
Andrew Morton <akpm@...ux-foundation.org>,
David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29
What if you added another phase to the journaling, after the data is
written to the kernel but before block allocation?
As I understand it, the current scenario goes like this:
1) A program writes a bunch of data to a file.
2) The kernel holds the data in buffer cache, delaying allocation.
3) Kernel updates file metadata in journal.
4) Some time later, kernel allocates blocks and writes data.
If things go boom between 3 and 4, you have the files in an inconsistent
state. If the program does an fsync(), then the kernel has to write ALL
data out to be consistent.
What if you could do this:
1) A program writes a bunch of data to a file.
2) The kernel holds the data in buffer cache, delaying allocation.
3) The kernel writes a record to the journal saying "This data goes with
this file, but I've not allocated any blocks for it yet."
4) Kernel updates file metadata in journal.
5) Sometime later, kernel allocates blocks for data, and notes the
allocation in the journal.
6) Sometime later still, the kernel commits the data to disk and updates
the journal.
It seems to me this would be a not-unreasonable way to have both the
advantages of delayed allocation AND get the data onto disk quickly.
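
To make the six steps concrete, here is a toy model in Python. It is only
a sketch of the idea, not how any real journal works: ToyJournal,
write_file, recover, the record names, and the block number are all made
up for illustration. It shows that once step 3 has happened, a crash at
any later point still leaves the data recoverable from the journal.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJournal:
    """Hypothetical append-only journal of (kind, filename, payload) records."""
    records: list = field(default_factory=list)

    def log(self, kind, name, payload=None):
        self.records.append((kind, name, payload))


def write_file(journal, name, data, crash_after_step=None):
    """Walk the six proposed steps, returning early to simulate a crash."""
    # Steps 1-2: data sits in the volatile cache, no blocks allocated yet.
    cache = {name: data}
    # Step 3: journal the data itself, flagged as not yet allocated.
    journal.log("unallocated_data", name, cache[name])
    if crash_after_step == 3:
        return
    # Step 4: journal the metadata update.
    journal.log("metadata", name, {"size": len(data)})
    if crash_after_step == 4:
        return
    # Step 5: allocate blocks and note the allocation (block number made up).
    journal.log("allocation", name, {"blocks": [100]})
    if crash_after_step == 5:
        return
    # Step 6: data reaches its final location; the journal records that.
    journal.log("committed", name)


def recover(journal):
    """Return data that was journaled in step 3 but never committed in step 6."""
    pending = {}
    for kind, name, payload in journal.records:
        if kind == "unallocated_data":
            pending[name] = payload
        elif kind == "committed":
            pending.pop(name, None)
    return pending
```

For example, calling write_file(j, "a.txt", b"hello", crash_after_step=4)
and then recover(j) hands back the data for a.txt, while a write that
runs through step 6 leaves nothing to replay.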
If the user wants speed over safety, you could skip steps 3 and 5
(data=ordered). If you want safety, you force everything through steps 3
and 5 (data=journal). If you want a middle ground, you only do steps 3
and 5 for files where the program has done an fsync() (data=ordered +
program calls fsync()).
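
The three choices above boil down to one decision: whether a given write
goes through the two extra journal steps (3 and 5). A rough sketch of
that decision in Python follows; the Policy names, steps_for, and the
program_synced flag are hypothetical labels for this message's options,
not real mount options or kernel interfaces.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for the three trade-offs described above."""
    SPEED = "speed"      # always skip steps 3 and 5
    SAFETY = "safety"    # always run steps 3 and 5
    MIDDLE = "middle"    # run steps 3 and 5 only if the program synced the file


ALL_STEPS = (1, 2, 3, 4, 5, 6)
EXTRA_JOURNAL_STEPS = {3, 5}


def steps_for(policy, program_synced=False):
    """Return the step numbers a write would go through under a policy."""
    if policy is Policy.SAFETY:
        use_extra = True
    elif policy is Policy.MIDDLE:
        use_extra = program_synced
    else:
        use_extra = False
    return tuple(s for s in ALL_STEPS
                 if use_extra or s not in EXTRA_JOURNAL_STEPS)
```

So a write under Policy.SPEED takes steps 1, 2, 4 and 6, and a write
under Policy.MIDDLE takes all six only when the program has asked for
the file to be synced.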
And if you want both speed and safety, you get a big battery-backed
RAM disk as the journal device and journal everything.