Message-ID: <32e3418b-727e-3018-1b8a-0530608fb34d@suse.de>
Date: Tue, 17 Dec 2019 08:28:33 +0100
From: Hannes Reinecke <hare@...e.de>
To: Damien Le Moal <Damien.LeMoal@....com>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-xfs@...r.kernel.org" <linux-xfs@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Johannes Thumshirn <jth@...nel.org>,
Naohiro Aota <Naohiro.Aota@....com>,
"Darrick J . Wong" <darrick.wong@...cle.com>
Subject: Re: [PATCH 1/2] fs: New zonefs file system
On 12/17/19 1:20 AM, Damien Le Moal wrote:
> On 2019/12/16 17:36, Hannes Reinecke wrote:
> [...]
>>> +static int zonefs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
>>> + unsigned int flags, struct iomap *iomap,
>>> + struct iomap *srcmap)
>>> +{
>>> + struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb);
>>> + struct zonefs_inode_info *zi = ZONEFS_I(inode);
>>> + loff_t max_isize = zi->i_max_size;
>>> + loff_t isize;
>>> +
>>> + /*
>>> + * For sequential zones, enforce direct IO writes. This is already
>>> + * checked when writes are issued, so warn about this here if we
>>> + * get buffered write to a sequential file inode.
>>> + */
>>> + if (WARN_ON_ONCE(zi->i_ztype == ZONEFS_ZTYPE_SEQ &&
>>> + (flags & IOMAP_WRITE) && !(flags & IOMAP_DIRECT)))
>>> + return -EIO;
>>> +
>>> + /*
>>> + * For all zones, all blocks are always mapped. For sequential zones,
>>> + * all blocks after the write pointer (inode size) are always unwritten.
>>> + */
>>> + mutex_lock(&zi->i_truncate_mutex);
>>> + isize = i_size_read(inode);
>>> + if (offset >= isize) {
>>> + length = min(length, max_isize - offset);
>>> + if (zi->i_ztype == ZONEFS_ZTYPE_CNV)
>>> + iomap->type = IOMAP_MAPPED;
>>> + else
>>> + iomap->type = IOMAP_UNWRITTEN;
>>> + } else {
>>> + length = min(length, isize - offset);
>>> + iomap->type = IOMAP_MAPPED;
>>> + }
>>> + mutex_unlock(&zi->i_truncate_mutex);
>>> +
>>> + iomap->offset = offset & (~sbi->s_blocksize_mask);
>>> + iomap->length = ((offset + length + sbi->s_blocksize_mask) &
>>> + (~sbi->s_blocksize_mask)) - iomap->offset;
>>> + iomap->bdev = inode->i_sb->s_bdev;
>>> + iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset;
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static const struct iomap_ops zonefs_iomap_ops = {
>>> + .iomap_begin = zonefs_iomap_begin,
>>> +};
>>> +
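To make the mapping logic above easier to follow, here is a rough user-space sketch of what zonefs_iomap_begin() computes. It is only an illustration under assumptions, not the patch code: struct extent, map_extent() and the two enums are made-up names, s_blocksize_mask is assumed to be the block size minus one (which is what the masking implies), and the offset < max_isize check is a precondition of the sketch that the quoted function does not have.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9

/* Hypothetical stand-ins for the zone type and the iomap extent type. */
enum ztype { ZTYPE_CNV, ZTYPE_SEQ };
enum ext_type { EXT_MAPPED, EXT_UNWRITTEN };

/* Hypothetical stand-in for the iomap fields the patch fills in. */
struct extent {
	enum ext_type type;
	uint64_t offset;	/* block-aligned start within the file */
	uint64_t length;	/* block-aligned length */
	uint64_t addr;		/* byte address on the device */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * isize is the current file size (the write pointer position for a
 * sequential file), max_isize the zone capacity in bytes, zsector the
 * first sector of the zone.
 */
static struct extent map_extent(enum ztype zt, uint64_t isize,
				uint64_t max_isize, uint64_t zsector,
				uint64_t blocksize_mask,
				uint64_t offset, uint64_t length)
{
	struct extent e;

	/* Sketch-only preconditions, not present in the quoted code. */
	assert((blocksize_mask & (blocksize_mask + 1)) == 0);
	assert(offset < max_isize);

	if (offset >= isize) {
		/* Beyond the file size: clamp to the zone capacity. */
		length = min_u64(length, max_isize - offset);
		e.type = (zt == ZTYPE_CNV) ? EXT_MAPPED : EXT_UNWRITTEN;
	} else {
		/* Below the file size: clamp to the written data. */
		length = min_u64(length, isize - offset);
		e.type = EXT_MAPPED;
	}

	/* Round the start down and the end up to block boundaries. */
	e.offset = offset & ~blocksize_mask;
	e.length = ((offset + length + blocksize_mask) & ~blocksize_mask)
		   - e.offset;
	e.addr = (zsector << SECTOR_SHIFT) + e.offset;
	return e;
}
```

With 4096-byte blocks, a request for 100 bytes at offset 5000 is widened to the single block starting at 4096, and the device address is simply the zone start plus that aligned offset.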
>> This probably shows my complete ignorance, but what is the effect of
>> enforcing direct I/O writes on the page cache?
>> I.e., what happens for buffered reads? Will the pages be invalidated when a
>> write has been issued?
>
> Yes, a direct write issued to a file range that has cached pages results
> in these pages being invalidated. But note that in the case of zonefs,
> this can happen only for conventional zones. For sequential
> zones, this does not happen: reads can be buffered and can cache pages, but
> only pages below the write pointer, and writes can only be issued at
> the write pointer. So there is never any overlap between
> buffered reads and direct writes.
>
Oh, indeed, you are correct. That makes it easy, then.
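Just to spell out the argument for sequential files, a small user-space sketch (all names here are hypothetical and nothing below is zonefs code). It works on plain byte ranges and ignores page granularity: the part of a buffered read that can populate the cache ends at the write pointer, and a valid direct write starts exactly there.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open byte range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Part of a buffered read that can hold data: clamped to below wp. */
static struct range cacheable_read(uint64_t pos, uint64_t len, uint64_t wp)
{
	struct range r = { pos, pos + len };

	if (r.end > wp)
		r.end = wp;
	if (r.start > r.end)
		r.start = r.end;	/* read entirely past wp: empty range */
	return r;
}

/* A write to a sequential file is only valid at the write pointer. */
static bool seq_write_valid(uint64_t pos, uint64_t wp)
{
	return pos == wp;
}

static bool ranges_overlap(struct range a, struct range b)
{
	return a.start < b.end && b.start < a.end;
}
```

Any range returned by cacheable_read() ends at or before wp, and any valid write range starts at wp, so ranges_overlap() can never be true for the pair.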
>> Or do we simply rely on upper layers to ensure no concurrent buffered
>> and direct I/O is being made?
>
> Nope. VFS, or the file system specific implementation, takes care of
> that. See generic_file_direct_write() and its call to
> invalidate_inode_pages2_range().
>
Of course.
One could even say: not applicable, as it won't happen.
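For the conventional zone case, where the invalidation does matter, this is roughly how a byte range of a direct write translates into the page index range that gets invalidated. It is a user-space sketch only: pages_for_write() and struct page_span are made-up names, and 4 KiB pages are an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Inclusive range of page cache indexes. */
struct page_span {
	uint64_t first;
	uint64_t last;
};

/* Page indexes touched by a write of len bytes at byte offset pos. */
static struct page_span pages_for_write(uint64_t pos, uint64_t len)
{
	struct page_span s;

	assert(len > 0);	/* an empty write touches no page */

	s.first = pos >> SKETCH_PAGE_SHIFT;
	s.last = (pos + len - 1) >> SKETCH_PAGE_SHIFT;
	return s;
}
```

A 200-byte write at offset 4000 straddles a page boundary, so it covers page indexes 0 and 1, while an aligned 4096-byte write at offset 8192 covers only index 2.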
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer