lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 28 Aug 2008 09:34:49 -0400
From:	Theodore Tso <tytso@....edu>
To:	Frédéric Bohé <frederic.bohe@...l.net>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: [PATCH 2/4] Create the journal in the middle of the filesystem

On Thu, Aug 28, 2008 at 11:55:21AM +0200, Frédéric Bohé wrote:
> With 512 groups by flex group, meta-datas for a single flex-group are 8
> groups long ! If we have no luck and there are a bunch of groups
> occupied by meta-datas at the middle of the filesystem, we should
> slightly increase the number of groups scanned to find a completely free
> group.

I'm not sure it ever makes sense to use such a huge -G setting, but
yes, you're right.  It actually wasn't a major tragedy, since this
just specifies the goal block, and so the block allocator would just
search forward to find the first free block.  But it is better to move
forward to the next free block group, so we leave space for interior
nodes of the extent tree to be allocated.

The following patch takes into account the flex_bg size, and will
stash the journal in the first free block group after metadata; we do
by starting at a flex_bg boundary, and then searching forward until
bg_free_blocks_count is non-zero.  However, if the number of block
groups is less than half of the flex_bg size, we'll just give up and
throw it at the mid-point of the filesystem, since that (plus using
extents instead of indirect blocks) is really the major optimization
here.  

One or two discontinuities in the journal file really isn't a big
deal, since we're normally seaking back and forth between the rest of
the filesystem data blocks and the journal anyway.  The best benchmark
to see a problem isn't going to be bonnie, but something that which is
extremely fsync-intensive.

						- Ted

diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 96b574e..f5a9dba 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -275,7 +275,7 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 				     blk_t size, int flags)
 {
 	char			*buf;
-	dgrp_t			group, start, end, i;
+	dgrp_t			group, start, end, i, log_flex;
 	errcode_t		retval;
 	struct ext2_inode	inode;
 	struct mkjournal_struct	es;
@@ -311,7 +311,17 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
 	 */
 	group = ext2fs_group_of_blk(fs, (fs->super->s_blocks_count -
 					 fs->super->s_first_data_block) / 2);
-	start = (group > 0) ? group-1 : group;
+	log_flex = 1 << fs->super->s_log_groups_per_flex;
+	if (fs->super->s_log_groups_per_flex && (group > log_flex)) {
+		group = group & ~(log_flex - 1);
+		while ((group < fs->group_desc_count) &&
+		       fs->group_desc[group].bg_free_blocks_count == 0)
+			group++;
+		if (group == fs->group_desc_count)
+			group = 0;
+		start = group;
+	} else
+		start = (group > 0) ? group-1 : group;
 	end = ((group+1) < fs->group_desc_count) ? group+1 : group;
 	group = start;
 	for (i=start+1; i <= end; i++)

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ