Message-ID: <20190829080308.GA19156@quack2.suse.cz>
Date:   Thu, 29 Aug 2019 10:03:08 +0200
From:   Jan Kara <jack@...e.cz>
To:     Dave Chinner <david@...morbit.com>
Cc:     Jan Kara <jack@...e.cz>,
        Matthew Bobrowski <mbobrowski@...browski.org>,
        linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        tytso@....edu, riteshh@...ux.ibm.com
Subject: Re: [PATCH 4/5] ext4: introduce direct IO write code path using
 iomap infrastructure

On Thu 29-08-19 08:32:18, Dave Chinner wrote:
> On Wed, Aug 28, 2019 at 10:26:19PM +0200, Jan Kara wrote:
> > On Mon 12-08-19 22:53:26, Matthew Bobrowski wrote:
> > > This patch introduces a new direct IO write code path implementation
> > > that makes use of the iomap infrastructure.
> > > 
> > > All direct IO write operations are now passed from the ->write_iter() callback
> > > to the new function ext4_dio_write_iter(). This function is responsible for
> > > calling into iomap infrastructure via iomap_dio_rw(). Snippets of the direct
> > > > IO code from within ext4_file_write_iter(), such as checking whether the IO
> > > > request is unaligned asynchronous IO, or whether it will be overwriting
> > > > allocated and initialized blocks, have been moved out and into
> > > > ext4_dio_write_iter().
> > > 
> > > The block mapping flags that are passed to ext4_map_blocks() from within
> > > ext4_dio_get_block() and friends have effectively been taken out and
> > > introduced within the ext4_iomap_begin(). If ext4_map_blocks() happens to have
> > > instantiated blocks beyond the i_size, then we attempt to place the inode onto
> > > the orphan list. Despite being able to perform i_size extension checking
> > > earlier on in the direct IO code path, it makes most sense to perform this bit
> > > post successful block allocation.
> > > 
> > > The ->end_io() callback ext4_dio_write_end_io() is responsible for removing
> > > the inode from the orphan list and determining if we should truncate a failed
> > > write in the case of an error. We also convert a range of unwritten extents to
> > > written if IOMAP_DIO_UNWRITTEN is set and perform the necessary
> > > i_size/i_disksize extension if the iocb->ki_pos + dio->size > i_size_read(inode).
> > > 
> > > > In the instance of a short write, we fall back to buffered IO and complete
> > > > whatever is left in the 'iter'. Any blocks that may have been allocated in
> > > preparation for direct IO will be reused by buffered IO, so there's no issue
> > > with leaving allocated blocks beyond EOF.
> > > 
> > > Signed-off-by: Matthew Bobrowski <mbobrowski@...browski.org>
> > > ---
> > >  fs/ext4/file.c  | 227 ++++++++++++++++++++++++++++++++++++++++----------------
> > >  fs/ext4/inode.c |  42 +++++++++--
> > >  2 files changed, 199 insertions(+), 70 deletions(-)
> > 
> > Overall this is very nice. Some smaller comments below.
> > 
> > > @@ -235,6 +244,34 @@ static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)
> > >  	return iov_iter_count(from);
> > >  }
> > >  
> > > +static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
> > > +					struct iov_iter *from)
> > > +{
> > > +	ssize_t ret;
> > > +	struct inode *inode = file_inode(iocb->ki_filp);
> > > +
> > > +	if (!inode_trylock(inode)) {
> > > +		if (iocb->ki_flags & IOCB_NOWAIT)
> > > +			return -EOPNOTSUPP;
> > > +		inode_lock(inode);
> > > +	}
> > 
> > Currently there's no support for IOCB_NOWAIT for buffered IO so you can
> > replace this with "inode_lock(inode)".
> 
> IOCB_NOWAIT is supported for buffered reads. It is not supported on
> buffered writes (as yet), so this should return EOPNOTSUPP if
> IOCB_NOWAIT is set, regardless of whether the lock can be grabbed or
> not.

Yeah, right. Thanks for correcting me.
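
For illustration only, the check Dave describes could look roughly like the
userspace sketch below. Everything prefixed with "sketch_" or "SKETCH_" is a
hypothetical stand-in (the struct layouts, the flag value and the lock helper
are not the kernel definitions); the point is just the ordering: reject
IOCB_NOWAIT first, without looking at the lock, then take the lock
unconditionally.

```c
#include <errno.h>

/* Stand-in for the kernel's IOCB_NOWAIT flag; the value is illustrative only. */
#define SKETCH_IOCB_NOWAIT (1 << 0)

/* Hypothetical, stripped-down stand-in for struct kiocb. */
struct sketch_kiocb {
	int ki_flags;
};

/* Hypothetical stand-in for an inode; only counts lock acquisitions. */
struct sketch_inode {
	int lock_count;
};

/* Hypothetical stand-in for inode_lock(); records that the lock was taken. */
static void sketch_inode_lock(struct sketch_inode *inode)
{
	inode->lock_count++;
}

/*
 * Entry check for a buffered write: buffered writes do not support
 * IOCB_NOWAIT, so fail with -EOPNOTSUPP before touching the lock,
 * regardless of whether the lock could have been grabbed.
 */
static long sketch_buffered_write_begin(struct sketch_kiocb *iocb,
					struct sketch_inode *inode)
{
	if (iocb->ki_flags & SKETCH_IOCB_NOWAIT)
		return -EOPNOTSUPP;

	sketch_inode_lock(inode);
	return 0;
}
```

With this ordering a NOWAIT request never reaches the lock at all, so the
result no longer depends on lock contention the way the trylock version does.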

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR