[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BYAPR04MB58165F4D55724B03EDED8E0DE71C0@BYAPR04MB5816.namprd04.prod.outlook.com>
Date: Fri, 7 Feb 2020 02:29:37 +0000
From: Damien Le Moal <Damien.LeMoal@....com>
To: Dave Chinner <david@...morbit.com>
CC: "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-xfs@...r.kernel.org" <linux-xfs@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Johannes Thumshirn <jth@...nel.org>,
Naohiro Aota <Naohiro.Aota@....com>,
"Darrick J . Wong" <darrick.wong@...cle.com>,
Hannes Reinecke <hare@...e.de>
Subject: Re: [PATCH v12 1/2] fs: New zonefs file system
On 2020/02/07 9:29, Dave Chinner wrote:
> On Thu, Feb 06, 2020 at 02:26:30PM +0900, Damien Le Moal wrote:
>> zonefs is a very simple file system exposing each zone of a zoned block
>> device as a file. Unlike a regular file system with zoned block device
>> support (e.g. f2fs), zonefs does not hide the sequential write
>> constraint of zoned block devices to the user. Files representing
>> sequential write zones of the device must be written sequentially
>> starting from the end of the file (append only writes).
>
> ....
>> + if (flags & IOMAP_WRITE)
>> + length = zi->i_max_size - offset;
>> + else
>> + length = min(length, isize - offset);
>> + mutex_unlock(&zi->i_truncate_mutex);
>> +
>> + iomap->offset = offset & (~sbi->s_blocksize_mask);
>> + iomap->length = ((offset + length + sbi->s_blocksize_mask) &
>> + (~sbi->s_blocksize_mask)) - iomap->offset;
>
> iomap->length = __ALIGN_MASK(offset + length, sbi->s_blocksize_mask) -
> iomap->offset;
>
> or it could just use ALIGN(..., sb->s_blocksize) and not need
> pre-calculation of the mask value...
Yes, that is cleaner. Fixed.
>> +static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
>> +{
>> + struct inode *inode = file_inode(iocb->ki_filp);
>> + struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb);
>> + struct zonefs_inode_info *zi = ZONEFS_I(inode);
>> + size_t count;
>> + ssize_t ret;
>> +
>> + /*
>> + * For async direct IOs to sequential zone files, ignore IOCB_NOWAIT
>> + * as this can cause write reordering (e.g. the first aio gets EAGAIN
>> + * on the inode lock but the second goes through but is now unaligned).
>> + */
>> + if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb)
>> + && (iocb->ki_flags & IOCB_NOWAIT))
>> + iocb->ki_flags &= ~IOCB_NOWAIT;
>
> Hmmm. I'm wondering if it would be better to return -EOPNOTSUPP here
> so that the application knows it can't do non-blocking write AIO to
> this file.
I wondered the same too. In the end, I decided to go with silently ignoring
the flag (for now) since raw block device accesses do the same (the NOWAIT
support is not complete and IOs may wait on free tags). I have an idea for
fixing simply the out-of-order issuing that may result from using nowait. I
will send a patch for that later and can then remove this.
But if zonefs does not make it to 5.6 (looking really tight), I will send
add that patch to a new zonefs series rebased for 5.7.
>
> Everything else looks OK to me.
Thanks !
>
> Cheers,
>
> Dave.
>
--
Damien Le Moal
Western Digital Research
Powered by blists - more mailing lists