lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190924095044.GB17526@bobrowski>
Date:   Tue, 24 Sep 2019 19:50:44 +1000
From:   Matthew Bobrowski <mbobrowski@...browski.org>
To:     Jan Kara <jack@...e.cz>
Cc:     tytso@....edu, adilger.kernel@...ger.ca,
        linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        david@...morbit.com, hch@...radead.org, darrick.wong@...cle.com
Subject: Re: [PATCH v3 2/6] ext4: move inode extension/truncate code out from
 ext4_iomap_end()

On Mon, Sep 23, 2019 at 06:21:15PM +0200, Jan Kara wrote:
> On Thu 12-09-19 21:04:00, Matthew Bobrowski wrote:
> > @@ -233,12 +234,90 @@ static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)
> >  	return iov_iter_count(from);
> >  }
> >  
> > +static int ext4_handle_inode_extension(struct inode *inode, loff_t offset,
> > +				       ssize_t len, size_t count)
> 
> Traditionally, we call one of the length arguments 'copied' or 'written' to
> denote actual amount of data processed and the original length is called
> 'len' or 'length' in iomap code. Can you please rename the arguments to
> follow this convention?

Sure, I will go with 'written'.

> > +/*
> > + * The inode may have been placed onto the orphan list or has had
> > + * blocks allocated beyond EOF as a result of an extension. We need to
> > + * ensure that any necessary cleanup routines are performed if the
> > + * error path has been taken for a write.
> > + */
> > +static int ext4_handle_failed_inode_extension(struct inode *inode, loff_t size)
> > +{
> > +	handle_t *handle;
> > +
> > +	if (size > i_size_read(inode))
> > +		ext4_truncate_failed_write(inode);
> > +
> > +	if (!list_empty(&EXT4_I(inode)->i_orphan)) {
> > +		handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
> > +		if (IS_ERR(handle)) {
> > +			if (inode->i_nlink)
> > +				ext4_orphan_del(NULL, inode);
> > +			return PTR_ERR(handle);
> > +		}
> > +		if (inode->i_nlink)
> > +			ext4_orphan_del(handle, inode);
> > +		ext4_journal_stop(handle);
> > +	}
> > +	return 0;
> > +}
> > +
> 
> After some thought, I'd just drop this function and fold the functionality
> into ext4_handle_inode_extension() by making it accept negative 'len'
> argument indicating error. The code just happens to be the simplest in that
> case (see below).
> 
> > @@ -255,7 +334,18 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  	if (ret)
> >  		goto out;
> >  
> > +	offset = iocb->ki_pos;
> >  	ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
> > +	if (ret > 0 && iocb->ki_pos > i_size_read(inode))
> > +		error = ext4_handle_inode_extension(inode, offset, ret,
> > +						    iov_iter_count(from));
> 
> You need to sample iov_iter_count(from) before calling dax_iomap_rw(). At
> this point iov_iter_count(from) is just what's left in the iter after
> writing what we could.
> 
> Also I don't think the condition iocb->ki_pos > i_size_read(inode) is
> correct here. Because it may happen that offset + count > i_size so we
> allocate some blocks beyond i_size but then we manage to copy only less so
> offset + ret == iocb->ki_pos <= i_size and you will not call
> ext4_handle_inode_extension() to truncate allocated blocks beyond i_size.

Yeah, this makes sense. I will implement it this way. Once again, appreciate
your suggestions.
 
> So I'd just call ext4_handle_inode_extension() unconditionally like:
> 
> 	error = ext4_handle_inode_extension(inode, offset, ret, len);
> 
> and have a quick check at the beginning of that function to avoid starting
> transaction when there isn't anything to do. Something like:
> 
> 	/*
> 	 * DIO and DAX writes get exclusion from truncate (i_rwsem) and
> 	 * page writeback (i_rwsem and flushing all dirty pages).
> 	 */
> 	WARN_ON_ONCE(i_size_read(inode) != EXT4_I(inode)->i_disksize);
> 	if (offset + count <= i_size_read(inode))
> 		return 0;
> 	if (len < 0)
> 		goto truncate;
> 
> 	... do the heavylifting with transaction start, inode size update,
> 	and orphan handling...
> 
> 	if (truncate) {
> truncate:
> 		ext4_truncate_failed_write(inode);
> orphan_del:
> 		/*
> 		 * If the truncate operation failed early the inode
> 		 * may still be on the orphan list. In that case, we
> 		 * need try remove the inode from the linked list in
> 		 * memory.
> 		 */
> 		if (inode->i_nlink)
> 			ext4_orphan_del(NULL, inode);
> 	}

--<M>--

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ