lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1203380074.3629.35.camel@localhost.localdomain>
Date:	Mon, 18 Feb 2008 16:14:34 -0800
From:	Mingming Cao <cmm@...ibm.com>
To:	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
Cc:	tytso@....edu, linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: When reading from fallocated blocks make sure we
	return zero.

On Sat, 2008-02-16 at 08:53 +0530, Aneesh Kumar K.V wrote:
> On Fri, Feb 15, 2008 at 11:43:04AM -0800, Mingming Cao wrote:
> > On Fri, 2008-02-15 at 23:46 +0530, Aneesh Kumar K.V wrote:
> > > fallocate blocks are considered as sparse area and read from them should
> > > return zero. ext4_ext_get_blocks should return zero for read request.
> > > 
> > 
> > The patch itself looks harmless, but I still don't see how this could
> > fix the problem you described at irc: a write hit a BUG_ON() in
> > fs/buffer.c saying the buffer is not mapped. Could you add more details
> > here?
> 
> Write will take the below call chain
> 
> ext4_write_begin
>   block_write_begin
>     __block_prepare_write
>        ext4_getblock
>          ext4_get_blocks_wrap
> (1)	   ext4_ext_get_blocks with create = 0 return allocated
>        ll_rw_block  if buffer not uptodate.
>          submit_bh
> 	   BUG_ON(!buffer_mapped(bh))
> 
> 
> ext4_ext_get_blocks at (1) should have returned 0. That would cause
> ext4_get_blocks_wrap to again call ext4_ext_get_blocks with create = 1
> and that would have returned us the buffer head which is mapped. This
> would also result in splitting the extent to initialized and
> uninitialized one.
> 

I see what is happening. Your fix seems treated preallocated blocks as
holes equally when get_blocks() with create = 0. This works for the
current case, but now I think the patch has side effect to delayed
allocation.  

How about the following patch?

Regards,
Mingming

ext4: ext4_get_blocks_wrap fix for writing to preallocated
From: Mingming Cao <cmm@...ibm.com>

This patch fixed a issue with wrting to a preallocated blocks.
A write hit a BUG_ON() in fs/buffer.c saying the buffer is not mapped.

On the write path, ext4_get_block_wrap() is called with create=1, but it
will pass create=0 down to the underlying ext4ext_get_blocks()
to do a look up first.  In the preallocation case, ext4_ext_get_blocks()
with create = 0, will return number of blocks pre-allocated and buffer
head unmapped. ext4_get_blocks_wrap() thinks it succeeds too early, without
checking if it needs again call ext4_ext_get_blocks with create = 1
which would do proper handling for writing to preallocated blocks:
split the extent to initialized and uninitialized one and
returns the mapped buffer head.

Treating preallocated blocks as holes equally(i.e. ignoring the number of blocks
pre-allocated and returns 0) when get_blocks() is called with create = 0 is not enough.
ext4_ext_get_blocks() needs to differentiate these two cases for delayed allocation
purpose, as for holes it need to do reservation and prepare for later
delayed allocation, but for pre-allocated blocks it needs skip that work.

It would makes things more clear if we have clear definition of what
get_blocks() return value means.

Similar to ext4_get_blocks_handle(),  the following 
* return > 0, # of blocks already allocated
*          if these are pre-allocated blocks and create = 0
*             buffer head is unmapped
*          otherwise blocks are mapped.
*
* return = 0, if plain look up failed (blocks have not been allocated)
*          buffer head is unmapped
*
* return < 0, error case.

The for the write path, at ext4_ext_get_blocks_wrap(), it could check the
buffer_mapped() status for preallocated extent before quit too early.

Signed-off-by: Mingming Cao <cmm@...ibm.com>

---
 fs/ext4/extents.c |   13 +++++++++++++
 fs/ext4/inode.c   |   41 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 51 insertions(+), 3 deletions(-)

Index: linux-2.6.25-rc2/fs/ext4/extents.c
===================================================================
--- linux-2.6.25-rc2.orig/fs/ext4/extents.c	2008-02-18 15:39:52.000000000 -0800
+++ linux-2.6.25-rc2/fs/ext4/extents.c	2008-02-18 15:43:33.000000000 -0800
@@ -2285,9 +2285,22 @@ out:
 }
 
 /*
+ * Block allocation/map/preallocation routine for extents based files
+ *
+ *
  * Need to be called with
  * down_read(&EXT4_I(inode)->i_data_sem) if not allocating file system block
  * (ie, create is zero). Otherwise down_write(&EXT4_I(inode)->i_data_sem)
+ *
+ * return > 0, number of of blocks already mapped/allocated
+ *          if these are pre-allocated blocks, buffer head is unmapped if
+ *          create = 0 (look up only)
+ *          otherwise blocks are mapped
+ *
+ * return = 0, if plain look up failed (blocks have not been allocated)
+ *          buffer head is unmapped
+ *
+ * return < 0, error case.
  */
 int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 			ext4_lblk_t iblock,
Index: linux-2.6.25-rc2/fs/ext4/inode.c
===================================================================
--- linux-2.6.25-rc2.orig/fs/ext4/inode.c	2008-02-18 15:07:00.000000000 -0800
+++ linux-2.6.25-rc2/fs/ext4/inode.c	2008-02-18 15:43:59.000000000 -0800
@@ -908,6 +908,26 @@ out:
  */
 #define DIO_CREDITS 25
 
+
+/*
+ * ext4 get_block() wrapper function
+ * It first do a look up, returns if the blocks already mapped. Otherwise
+ * it takes the write sem and do block allocation
+ *
+ * If file type is extents based, call with ext4_ext_get_blocks()
+ * Otherwise, call with ext4_get_blocks_handle() to handle indirect mapping
+ * based files
+ *
+ * return > 0, number of of blocks already mapped/allocated
+ *          if these are pre-allocated blocks, buffer head is unmapped if
+ *          create = 0 (look up only)
+ *          otherwise blocks are mapped
+ *
+ * return = 0, if plain look up failed (blocks have not been allocated)
+ *          buffer head is unmapped
+ *
+ * return < 0, error case.
+ */
 int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block,
 			unsigned long max_blocks, struct buffer_head *bh,
 			int create, int extend_disksize)
@@ -926,12 +946,27 @@ int ext4_get_blocks_wrap(handle_t *handl
 				inode, block, max_blocks, bh, 0, 0);
 	}
 	up_read((&EXT4_I(inode)->i_data_sem));
-	if (!create || (retval > 0))
+
+	/* If it is only a block(s) look up */
+	if (!create )
+		return retval;
+
+	/*
+	 * Returns if the blocks have already allocated
+	 *
+	 * Note that if blocks have been preallocated
+	 * ext4_ext_get_block() returns with buffer head unmapped.
+	 * Write to a preallocated space needs to split
+	 * the preallocated extents, thus needs to update
+	 * i_data
+	 */
+	if (retval > 0 && buffer_mapped(bh))
 		return retval;
 
 	/*
-	 * We need to allocate new blocks which will result
-	 * in i_data update
+	 * New blocks and preallocation handling will possiblely result
+	 * in i_data update, take the write sem, and call get_blocks()
+	 * with create = 1
 	 */
 	down_write((&EXT4_I(inode)->i_data_sem));
 	/*


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ