Message-Id: <87o7nmivqm.fsf@doe.com>
Date: Mon, 17 Apr 2023 17:08:57 +0530
From: Ritesh Harjani (IBM) <ritesh.list@...il.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@...radead.org>,
"Darrick J . Wong" <djwong@...nel.org>,
Ojaswin Mujoo <ojaswin@...ux.ibm.com>,
Disha Goel <disgoel@...ux.ibm.com>,
Christoph Hellwig <hch@....de>
Subject: Re: [PATCHv5 2/9] fs/buffer.c: Add generic_buffer_fsync implementation
Jan Kara <jack@...e.cz> writes:
> On Sun 16-04-23 15:38:37, Ritesh Harjani (IBM) wrote:
>> Some of the higher layers like iomap take inode_lock() when calling
>> generic_write_sync().
>> Also writeback already happens from other paths without inode lock,
>> so it's difficult to say that we really need sync_mapping_buffers() to
>> take any inode locking here. Having said that, let's add
>> generic_buffer_fsync() implementation in buffer.c with no
>> inode_lock/unlock() for now so that filesystems like ext2 and
>> ext4's nojournal mode can use it.
>>
>> When ext4 was converted to iomap for direct I/O, it already copied its own
>> variant of __generic_file_fsync() without the lock. Hence let's add a helper
>> API and use it both in ext2 and ext4.
>>
>> Later we can review other filesystems as well to see if we can make
>> generic_buffer_fsync() which does not take any inode_lock() as the
>> default path.
>>
>> Tested-by: Disha Goel <disgoel@...ux.ibm.com>
>> Reviewed-by: Christoph Hellwig <hch@....de>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@...il.com>
>
> There is a problem with generic_buffer_fsync() that it does not call
> blkdev_issue_flush() so the caller is responsible for doing that. That's
> necessary for ext2 & ext4 so fine for now. But historically this was the
> case with generic_file_fsync() as well and that led to many filesystems
> forgetting to flush caches from fsync(2).
Ok, thanks for the details.
> What is our transition plan for
> these filesystems that currently do the cache flush from
> generic_file_fsync()? Do we want to eventually keep generic_file_fsync()
> doing the cache flush and call generic_buffer_fsync() instead of
> __generic_buffer_fsync() from it?
Frankly speaking, I was thinking we would come back to this question
when we start working on those changes. At this point I have only
looked at it from the perspective of the ext2 DIO changes.
But since you asked, here is what I think we could do:
Rename generic_file_fsync() to generic_buffers_sync() and move it to
fs/buffer.c. Then:
	generic_buffers_sync() {
		ret = generic_buffers_fsync();
		if (!ret)
			ret = blkdev_issue_flush();
		return ret;
	}
generic_buffers_fsync() is the same as the function in this patch, which
does not do the cache flush.
(I will rename it from generic_buffer_fsync() to generic_buffers_fsync().)
Note: the naming is chosen such that:
- sync means it will do the fsync followed by a cache flush.
- fsync means it will only do the file fsync.
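To make the proposed layering concrete, here is a minimal user-space
sketch. It is only a model under stated assumptions, not kernel code: the
struct and the model_* names are hypothetical stand-ins, and propagating
the flush result to the caller is an assumption on top of the outline
above.

```c
#include <assert.h>
#include <errno.h>

/* User-space model only; every name here is hypothetical. */
struct model_file {
	int writeback_err;   /* error the modeled fsync should report, 0 = ok */
	int flush_err;       /* error the modeled cache flush should report */
	int flushes_issued;  /* number of modeled cache flushes so far */
};

/* Models generic_buffers_fsync(): write out data and metadata, no cache flush. */
static int model_buffers_fsync(struct model_file *f)
{
	return f->writeback_err;
}

/* Stands in for the device cache flush (blkdev_issue_flush() in the kernel). */
static int model_cache_flush(struct model_file *f)
{
	f->flushes_issued++;
	return f->flush_err;
}

/* Models generic_buffers_sync(): fsync first, flush only if that succeeded. */
static int model_buffers_sync(struct model_file *f)
{
	int ret = model_buffers_fsync(f);

	if (!ret)
		ret = model_cache_flush(f);  /* assumption: flush errors are returned */
	return ret;
}
```

In this model a caller that wants the cache flush uses
model_buffers_sync(), while a caller that handles the flush itself
uses model_buffers_fsync() directly and never triggers one.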
As I understand it, we would eventually like to kill the
inode_lock() variants of generic_file_fsync() and __generic_file_fsync()
after auditing other filesystem code, right?
Then for now what we need is a generic_buffers_sync() function which does
not take inode_lock() and also does the cache flush, which is required for
ext2, and a generic_buffers_fsync() which does not do any cache flush, which
is what a filesystem like ext4 needs.
Does that sound good to you? Is the naming OK as well?
If yes, then I can rename the function below to generic_buffers_fsync()
and also add an implementation of generic_buffers_sync().
Then ext2 and ext4 can use them.
-ritesh
>
> Honza
>
>> ---
>> fs/buffer.c | 43 +++++++++++++++++++++++++++++++++++++
>> include/linux/buffer_head.h | 2 ++
>> 2 files changed, 45 insertions(+)
>>
>> diff --git a/fs/buffer.c b/fs/buffer.c
>> index 9e1e2add541e..df98f1966a71 100644
>> --- a/fs/buffer.c
>> +++ b/fs/buffer.c
>> @@ -593,6 +593,49 @@ int sync_mapping_buffers(struct address_space *mapping)
>> }
>> EXPORT_SYMBOL(sync_mapping_buffers);
>>
>> +/**
>> + * generic_buffer_fsync - generic buffer fsync implementation
>> + * for simple filesystems with no inode lock
>> + *
>> + * @file: file to synchronize
>> + * @start: start offset in bytes
>> + * @end: end offset in bytes (inclusive)
>> + * @datasync: only synchronize essential metadata if true
>> + *
>> + * This is a generic implementation of the fsync method for simple
>> + * filesystems which track all non-inode metadata in the buffers list
>> + * hanging off the address_space structure.
>> + */
>> +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
>> + bool datasync)
>> +{
>> + struct inode *inode = file->f_mapping->host;
>> + int err;
>> + int ret;
>> +
>> + err = file_write_and_wait_range(file, start, end);
>> + if (err)
>> + return err;
>> +
>> + ret = sync_mapping_buffers(inode->i_mapping);
>> + if (!(inode->i_state & I_DIRTY_ALL))
>> + goto out;
>> + if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
>> + goto out;
>> +
>> + err = sync_inode_metadata(inode, 1);
>> + if (ret == 0)
>> + ret = err;
>> +
>> +out:
>> + /* check and advance again to catch errors after syncing out buffers */
>> + err = file_check_and_advance_wb_err(file);
>> + if (ret == 0)
>> + ret = err;
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(generic_buffer_fsync);
>> +
>> /*
>> * Called when we've recently written block `bblock', and it is known that
>> * `bblock' was for a buffer_boundary() buffer. This means that the block at
>> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
>> index 8f14dca5fed7..3170d0792d52 100644
>> --- a/include/linux/buffer_head.h
>> +++ b/include/linux/buffer_head.h
>> @@ -211,6 +211,8 @@ int inode_has_buffers(struct inode *);
>> void invalidate_inode_buffers(struct inode *);
>> int remove_inode_buffers(struct inode *inode);
>> int sync_mapping_buffers(struct address_space *mapping);
>> +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
>> + bool datasync);
>> void clean_bdev_aliases(struct block_device *bdev, sector_t block,
>> sector_t len);
>> static inline void clean_bdev_bh_alias(struct buffer_head *bh)
>> --
>> 2.39.2
>>
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR