lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <alpine.OSX.2.00.1402121654410.60058@scrumpy>
Date:	Wed, 12 Feb 2014 17:00:00 -0700 (MST)
From:	Ross Zwisler <ross.zwisler@...ux.intel.com>
To:	Ross Zwisler <ross.zwisler@...ux.intel.com>
cc:	Matthew Wilcox <matthew.r.wilcox@...el.com>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, linux-ext4@...r.kernel.org, david@...morbit.com
Subject: Re: [PATCH v5 19/22] ext4: Add XIP functionality

On Tue, 11 Feb 2014, Ross Zwisler wrote:
> On Wed, 15 Jan 2014, Matthew Wilcox wrote:
> > From: Ross Zwisler <ross.zwisler@...ux.intel.com>
> > 
> > This is a port of the XIP functionality found in the current version of
> > ext2.
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@...ux.intel.com>
> > Reviewed-by: Andreas Dilger <andreas.dilger@...el.com>
> > [heavily tweaked]
> > Signed-off-by: Matthew Wilcox <matthew.r.wilcox@...el.com>
> 
> ...
> 
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index c767666..8b73d77 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -663,6 +663,18 @@ found:
> >  			WARN_ON(1);
> >  		}
> >  
> > +		/* this is probably wrong for ext4.  unlike ext2, ext4 supports
> > +		 * uninitialised extents, so we should probably be hooking
> > +		 * into the "make it initialised" code instead. */
> > +		if (IS_XIP(inode)) {
> 
> With the very first version of this patch the above logic seemed to work
> correctly, zeroing blocks as we allocated them.  With the current XIP
> infrastructure based tightly on direct IO this ends up being wrong because in
> some cases we can call ext4_map_blocks() twice for a given block.  
> 
> A quick userland test program that creates a new file, truncates it up to 4k
> and then does a partial block write will end up giving you a file filled with
> all zeros.  This is because we zero the data before the write, do the write,
> and then zero again, overwriting the data.  The second call to
> ext4_map_blocks() happens via ext4_ext_direct_IO =>
> ext4_convert_unwritten_extents() => ext4_map_blocks().
> 
> We can know in ext4_map_blocks() that we are being called after a write has
> already completed by looking at the flags.  One solution to get around this
> double-zeroing would be to change the above test to:
> 
> +                 if (IS_XIP(inode) && !(flags & EXT4_GET_BLOCKS_CONVERT)) {
> 
> This fixes the tests I've been able to come up with, but I'm not certain it's
> the correct fix for the long term.  It seems wasteful to zero the blocks we're
> allocating, just to have the zeros overwritten immediately by a write.  Maybe
> a cleaner way would be to try and zero the unwritten bits inside of
> ext4_convert_unwritten_extents(), or somewhere similar?
> 
> It's worth noting that I don't think the direct I/O path has this kind of
> logic because they don't allow partial block writes.  The regular I/O path
> knows to zero unwritten space based on the BH_New flag, as set via the
> set_buffer_new() call in ext4_da_map_blocks().  This is a pretty different I/O
> path, though, so I'm not sure how much we can borrow for the XIP code.
> 
> Thoughts on the correct fix?
> 
> - Ross

It looks like Dave Chinner outlined a way to deal with this in response to the
"[PATCH v5 00/22] Rewrite XIP code and add XIP support to ext4" mail.

I'll try and implement things as Dave has described (zero full blocks in the
case of xip_fault() and mark extents as written, use buffer_new(bh) to zero
edges for normal I/O) and send out code or questions as I have them.

- Ross
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ