Date:   Mon, 20 Sep 2021 20:42:03 -0700
From:   Sarthak Kukreti <sarthakkukreti@...omium.org>
To:     linux-ext4@...r.kernel.org
Cc:     gwendal@...omium.org, tytso@....edu,
        Sarthak Kukreti <sarthakkukreti@...omium.org>
Subject: [PATCH] mke2fs: Add extended option for prezeroed storage devices

From: Sarthak Kukreti <sarthakkukreti@...omium.org>

This patch adds an extended option "assume_storage_prezeroed" to
mke2fs. When enabled, this option acts as a hint to mke2fs that
the underlying block device was zeroed before mke2fs was called.
This allows mke2fs to skip zeroing the inode table and the
journal, which shortens filesystem creation time.

Additionally, on thinly provisioned storage devices (like Ceph,
dm-thin), reads on unmapped extents return zero. This property
allows mke2fs (with assume_storage_prezeroed) to avoid
pre-allocating metadata space for inode tables for the entire
filesystem, saving the space that would otherwise be allocated
for zeroed inode tables.
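
Because the option is only a hint, a caller may want to confirm
that a region really reads back as zeros before passing it. The
sketch below is an illustration and is not part of this patch or
of e2fsprogs: region_reads_as_zero() is a hypothetical helper that
uses plain POSIX pread() on an already opened file descriptor.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#endif

#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not in e2fsprogs): returns true only if all
 * `len` bytes starting at `offset` in `fd` read back as zero.
 * Returns false on a non-zero byte, an I/O error, or end of file
 * before `len` bytes were read.
 */
static bool region_reads_as_zero(int fd, off_t offset, size_t len)
{
	static const unsigned char zeros[4096];	/* zero-initialized */
	unsigned char buf[4096];

	while (len > 0) {
		size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t got = pread(fd, buf, chunk, offset);

		if (got <= 0)
			return false;	/* error or unexpected EOF */
		if (memcmp(buf, zeros, (size_t)got) != 0)
			return false;	/* found non-zero data */
		offset += got;
		len -= (size_t)got;
	}
	return true;
}
```

Scanning a whole device this way would cost more than the zeroing
that the option avoids, so a check like this is only practical as
a spot check on a few regions.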

Testing on ChromeOS (running Linux kernel 4.19) with dm-thin
and 200GB thin logical volumes using 'mke2fs -t ext4 <dev>':

- Time taken by mke2fs drops from 1.07s to 0.08s.
- Avoiding zeroing out the inode table and journal reduces the
  initial metadata space allocation from 0.48% to 0.01%.
- Lazy inode table zeroing results in a further 1.45% of logical
  volume space getting allocated for inode tables, even if no file
  data is added to the filesystem. With assume_storage_prezeroed,
  the metadata allocation remains at 0.01%.
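
To put the percentages above in absolute terms, the sketch below
converts them to bytes. It rests on two assumptions that the
measurements do not state: that "200GB" means 200 * 10^9 bytes,
and that every percentage is a fraction of the logical volume
size. bytes_for_hundredths_percent() is a hypothetical helper
written for this illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: bytes corresponding to a share of a volume,
 * where the share is given in hundredths of a percent (48 == 0.48%).
 * Integer arithmetic keeps the result exact for these inputs.
 */
static uint64_t bytes_for_hundredths_percent(uint64_t volume_bytes,
					     uint64_t hundredths)
{
	return volume_bytes * hundredths / 10000;
}

/* Assumed volume size: 200GB taken as 200 * 10^9 bytes. */
static const uint64_t assumed_volume_bytes = 200ULL * 1000 * 1000 * 1000;
```

Under those assumptions, 0.48% is 960,000,000 bytes, 0.01% is
20,000,000 bytes, and the further 1.45% from lazy inode table
zeroing is 2,900,000,000 bytes.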

Signed-off-by: Sarthak Kukreti <sarthakkukreti@...omium.org>
---
 misc/mke2fs.8.in |  6 ++++++
 misc/mke2fs.c    | 21 ++++++++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/misc/mke2fs.8.in b/misc/mke2fs.8.in
index c0b53245..b82f8445 100644
--- a/misc/mke2fs.8.in
+++ b/misc/mke2fs.8.in
@@ -364,6 +364,12 @@ This speeds up file system initialization noticeably, but carries some
 small risk if the system crashes before the journal has been overwritten
 entirely one time.  If the option value is omitted, it defaults to 1 to
 enable lazy journal inode zeroing.
+.B assume_storage_prezeroed\fR[\fB= \fI<0 to disable, 1 to enable>\fR]
+If enabled,
+.BR mke2fs
+assumes that the storage device has been prezeroed, skips zeroing the journal
+and inode tables, and annotates the block group flags to signal that the inode
+table has been zeroed.
 .TP
 .B no_copy_xattrs
 Normally
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 04b2fbce..5293d9b0 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -95,6 +95,7 @@ int	journal_size;
 int	journal_flags;
 int	journal_fc_size;
 static int	lazy_itable_init;
+static int	assume_storage_prezeroed;
 static int	packed_meta_blocks;
 int		no_copy_xattrs;
 static char	*bad_blocks_filename = NULL;
@@ -1012,6 +1013,11 @@ static void parse_extended_opts(struct ext2_super_block *param,
 				lazy_itable_init = strtoul(arg, &p, 0);
 			else
 				lazy_itable_init = 1;
+		} else if (!strcmp(token, "assume_storage_prezeroed")) {
+			if (arg)
+				assume_storage_prezeroed = strtoul(arg, &p, 0);
+			else
+				assume_storage_prezeroed = 1;
 		} else if (!strcmp(token, "lazy_journal_init")) {
 			if (arg)
 				journal_flags |= strtoul(arg, &p, 0) ?
@@ -1115,7 +1121,8 @@ static void parse_extended_opts(struct ext2_super_block *param,
 			"\tnodiscard\n"
 			"\tencoding=<encoding>\n"
 			"\tencoding_flags=<flags>\n"
-			"\tquotatype=<quota type(s) to be enabled>\n\n"),
+			"\tquotatype=<quota type(s) to be enabled>\n"
+			"\tassume_storage_prezeroed=<0 to disable, 1 to enable>\n\n"),
 			badopt ? badopt : "");
 		free(buf);
 		exit(1);
@@ -3095,6 +3102,18 @@ int main (int argc, char *argv[])
 		io_channel_set_options(fs->io, opt_string);
 	}
 
+	if (assume_storage_prezeroed) {
+		if (verbose)
+			printf("%s",
+			       _("Assuming the storage device is prezeroed "
+				 "- skipping inode table and journal wipe\n"));
+
+		lazy_itable_init = 1;
+		itable_zeroed = 1;
+		zero_hugefile = 0;
+		journal_flags |= EXT2_MKJOURNAL_LAZYINIT;
+	}
+
 	/* Can't undo discard ... */
 	if (!noaction && discard && dev_size && (io_ptr != undo_io_manager)) {
 		retval = mke2fs_discard_device(fs);
-- 
2.31.0
