lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190916223741.GA5936@bobrowski>
Date:   Tue, 17 Sep 2019 08:37:41 +1000
From:   Matthew Bobrowski <mbobrowski@...browski.org>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     tytso@....edu, jack@...e.cz, adilger.kernel@...ger.ca,
        linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        david@...morbit.com, darrick.wong@...cle.com
Subject: Re: [PATCH v3 5/6] ext4: introduce direct IO write path using iomap
 infrastructure

On Mon, Sep 16, 2019 at 05:12:48AM -0700, Christoph Hellwig wrote:
> On Thu, Sep 12, 2019 at 09:04:46PM +1000, Matthew Bobrowski wrote:
> > @@ -213,12 +214,16 @@ static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)
> >  	struct inode *inode = file_inode(iocb->ki_filp);
> >  	ssize_t ret;
> >  
> > +	if (unlikely(IS_IMMUTABLE(inode)))
> > +		return -EPERM;
> > +
> >  	ret = generic_write_checks(iocb, from);
> >  	if (ret <= 0)
> >  		return ret;
> >  
> > -	if (unlikely(IS_IMMUTABLE(inode)))
> > -		return -EPERM;
> > +	ret = file_modified(iocb->ki_filp);
> > +	if (ret)
> > +		return 0;
> >  
> >  	/*
> >  	 * If we have encountered a bitmap-format file, the size limit
> 
> Independent of the error return issue you probably want to split
> modifying ext4_write_checks into a separate preparation patch.

Providing that there's no objections to introducing a possible performance
change with this separate preparation patch (overhead of calling
file_remove_privs/file_update_time twice), then I have no issues in doing so.

> > +/*
> > + * For a write that extends the inode size, ext4_dio_write_iter() will
> > + * wait for the write to complete. Consequently, operations performed
> > + * within this function are still covered by the inode_lock(). On
> > + * success, this function returns 0.
> > + */
> > +static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size, int error,
> > +				 unsigned int flags)
> > +{
> > +	int ret;
> > +	loff_t offset = iocb->ki_pos;
> > +	struct inode *inode = file_inode(iocb->ki_filp);
> > +
> > +	if (error) {
> > +		ret = ext4_handle_failed_inode_extension(inode, offset + size);
> > +		return ret ? ret : error;
> > +	}
> 
> Just a personal opinion, but I find the use of the ternary operator
> here a little weird.
> 
> A plain old:
> 
> 	ret = ext4_handle_failed_inode_extension(inode, offset + size);
> 	if (ret)
> 		return ret;
> 	return error;
> 
> flow much easier.

Agree, much cleaner.
 
> > +	if (!inode_trylock(inode)) {
> > +		if (iocb->ki_flags & IOCB_NOWAIT)
> > +			return -EAGAIN;
> > +		inode_lock(inode);
> > +	}
> > +
> > +	if (!ext4_dio_checks(inode)) {
> > +		inode_unlock(inode);
> > +		/*
> > +		 * Fallback to buffered IO if the operation on the
> > +		 * inode is not supported by direct IO.
> > +		 */
> > +		return ext4_buffered_write_iter(iocb, from);
> 
> I think you want to lift the locking into the caller of this function
> so that you don't have to unlock and relock for the buffered write
> fallback.

I don't exactly know what you really mean by "lift the locking into the caller
of this function". I'm interpreting that as moving the inode_unlock()
operation into ext4_buffered_write_iter(), but I can't see how that would be
any different from doing it directly here? Wouldn't this also run the risk of
the locks becoming unbalanced as we'd need to add checks around whether the
resource is being contended? Maybe I'm misunderstanding something here...

> > +	if (offset + count > i_size_read(inode) ||
> > +	    offset + count > EXT4_I(inode)->i_disksize) {
> > +		ext4_update_i_disksize(inode, inode->i_size);
> > +		extend = true;
> 
> Doesn't the ext4_update_i_disksize need to be under an open journal
> handle?

After all, it is a metadata update, which should go through an open journal
handle.

Thank you for the review Christoph!

--<M>--

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ