Message-ID: <loom.20100725T211257-863@post.gmane.org>
Date: Sun, 25 Jul 2010 19:13:50 +0000 (UTC)
From: Suli Yang <yangsuli@...il.com>
To: linux-kernel@...r.kernel.org
Subject: question about adding checksumming into ext2
Hello, everyone,
I am trying to modify the ext2 file system to include checksumming on a
per-block basis. That is, for each block pointer in the file system, I would
like to add one field to store a checksum. When writing a block, the fs will
compute the checksum for that block of data and store it in the appropriate
place; when reading a block, the fs will compute the checksum again and compare
it to the stored one. If there is a mismatch, the fs reports an error.
I am not quite sure how to proceed, so I am wondering if anyone could give me a
hand or direct me to some resources which might be helpful.
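
A minimal userspace sketch of the write-side and read-side steps described
above. The message does not name a checksum algorithm, so CRC-32 is an
assumption here, and the function names (block_csum, csum_on_write,
csum_on_read) are hypothetical, not ext2 functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but short.
 * Assumption: the post does not say which checksum is used. */
static uint32_t block_csum(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Write path: compute the value to store next to the block pointer. */
static uint32_t csum_on_write(const unsigned char *block, size_t block_size)
{
    return block_csum(block, block_size);
}

/* Read path: recompute and compare with the stored value.
 * Returns 0 on a match and -1 on a mismatch. */
static int csum_on_read(const unsigned char *block, size_t block_size,
                        uint32_t stored)
{
    return block_csum(block, block_size) == stored ? 0 : -1;
}
```

In a real file system the mismatch case would presumably turn into an I/O
error returned to the reader rather than a plain -1.
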
Basically, what I have done so far is add fields to the inode and the indirect
blocks to hold the checksums. (For now I put each checksum immediately after
its pointer, and each checksum occupies the space of a __le32. That is probably
not the best design, because it would be very difficult to change the size of
the checksum later; but what's done is done.) For this purpose I also changed
the way the ext2 file system walks the pointers in the file metadata (to be
more specific, the ext2_get_branch, ext2_alloc_branch, ext2_splice_branch,
ext2_get_blocks functions, etc.). I have already compiled this part of the code
and done some basic tests, and it seems to run OK.
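
The pointer-plus-checksum layout described above can be sketched as follows.
This is an illustration only: the struct and helper names are hypothetical,
and uint32_t stands in for the kernel's __le32.

```c
#include <stdint.h>

/* Hypothetical on-disk slot: a block pointer immediately followed by the
 * checksum of the block it points to. */
struct csum_ptr {
    uint32_t blk;   /* block number; 0 means a hole */
    uint32_t csum;  /* checksum of the data in block blk */
};

/* Number of pointer slots that fit in one indirect block. */
static unsigned int slots_per_block(unsigned int block_size)
{
    return block_size / (unsigned int)sizeof(struct csum_ptr);
}

/* Byte offset of slot n inside an indirect block. */
static unsigned int slot_offset(unsigned int n)
{
    return n * (unsigned int)sizeof(struct csum_ptr);
}
```

Stock ext2 fits block_size / 4 pointers in an indirect block, so with 8-byte
slots each indirect block maps half as many data blocks, which is why the
pointer-walking functions have to change.
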
So what I need to do now is modify the read/write functions of ext2, so that
when a block of data is read in or written out, the checksumming is done as
well. So my question is: where should I make the modification?
From my understanding, I have two choices (please correct me if I am wrong).
One is to do the checksumming when we read from or write to the page cache,
that is, to modify something like file->f_op->aio_read/aio_write or
address_space->a_ops->write_end. The benefit is that it is easier to get the
block numbers (relative to the beginning of the file) from the range of the
file being read or written, and thus to know which checksums to update (it's
not that simple, though). The drawback is that we have to compute the checksum
each time we access the page cache, which would be a significant cost. Also,
the page cache is changed in a lot of places, e.g. regular file read/write,
nobh write, file mapping, etc. It would be quite difficult to locate all of
those places, and it may not be good practice.
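
For the first option, mapping a file range to file-relative block numbers can
be sketched like this. The names are hypothetical, and blkbits is assumed to
be log2 of the block size (for example 10 for 1 KiB blocks).

```c
#include <stdint.h>

struct blk_range {
    uint64_t first;  /* first file-relative block touched */
    uint64_t last;   /* last file-relative block touched */
};

/* Blocks touched by the byte range [pos, pos + len); len must be > 0. */
static struct blk_range range_to_blocks(uint64_t pos, uint64_t len,
                                        unsigned int blkbits)
{
    struct blk_range r;

    r.first = pos >> blkbits;
    r.last = (pos + len - 1) >> blkbits;
    return r;
}

/* Nonzero if the range covers the given block completely. A partially
 * covered block needs its remaining bytes before its checksum can be
 * recomputed. */
static int covers_whole_block(uint64_t pos, uint64_t len,
                              unsigned int blkbits, uint64_t block)
{
    uint64_t start = block << blkbits;
    uint64_t end = start + ((uint64_t)1 << blkbits);

    return pos <= start && pos + len >= end;
}
```

The partial-block case is one reason this is not as simple as it looks: a
100-byte write at offset 1000 with 1 KiB blocks touches blocks 0 and 1 but
covers neither of them fully.
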
The other possibility is to modify the path where we read data from the disk
or commit writes to the disk. The advantage of doing this is obvious: the
checksum is computed only when data actually moves to or from the disk.
However, there are still some problems:
1. The code which actually accesses the disk may be spread everywhere, too.
From reading the code I learned that both generic_file_aio_write and the
background threads responsible for flushing dirty pages to disk eventually
call ext2_writepage (ext2_writepages will call __mpage_writepage, which will
then call ext2_writepage in the case of an ext2 file, right?). However, for
many other cases, say nobh writing, file mapping, or direct I/O, do they all
eventually call ext2_writepage or not? I guess I still need to find out.
2. The functions which read/write data from/to the disk are so low-level that
their argument is something like a bio instead of a file position and range,
and it is not conceptually straightforward to get the file position being
read or written from the bio.
3. Some file operations (say, simple_fsync, which is called by ext2_fsync)
actually call ll_rw_block and bypass the ext2 writepage(s) functions. What
should I do in this case?
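
On point 2, one possible route (a sketch, not a confirmed answer) is to go
through the page rather than the bio itself: for a page-cache page the kernel
records the page's index within the file, and the offset inside the page gives
the rest. The arithmetic is shown below in userspace form; the function and
parameter names are hypothetical, and it assumes the pages really are
page-cache pages of the file, which would not hold for direct I/O.

```c
#include <stdint.h>

/* File-relative block number for data found at offset_in_page within the
 * page whose index in the file is page_index.
 * page_shift is log2(page size), blkbits is log2(block size), and
 * blkbits <= page_shift is assumed. */
static uint64_t page_to_file_block(uint64_t page_index,
                                   unsigned int offset_in_page,
                                   unsigned int page_shift,
                                   unsigned int blkbits)
{
    /* First block of the file that is backed by this page. */
    uint64_t first = page_index << (page_shift - blkbits);

    return first + (offset_in_page >> blkbits);
}
```

For example, with 4 KiB pages and 1 KiB blocks, offset 2048 in page 3 is
file block 3 * 4 + 2 = 14.
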
Thank you very much for your attention and help.
P.S. I noticed that the generic_file_sync function first calls
filemap_write_and_wait_range() to sync a file, then calls
sync_mapping_buffers to sync it again, and then calls sync_inode() on the
very same file. Is there a particular reason for this redundancy? Thank you.