lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 17 Apr 2023 13:01:49 +0200
From:   Jan Kara <jack@...e.cz>
To:     "Ritesh Harjani (IBM)" <ritesh.list@...il.com>
Cc:     linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
        Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@...radead.org>,
        "Darrick J . Wong" <djwong@...nel.org>,
        Ojaswin Mujoo <ojaswin@...ux.ibm.com>,
        Disha Goel <disgoel@...ux.ibm.com>,
        Christoph Hellwig <hch@....de>
Subject: Re: [PATCHv5 2/9] fs/buffer.c: Add generic_buffer_fsync
 implementation

On Sun 16-04-23 15:38:37, Ritesh Harjani (IBM) wrote:
> Some of the higher layers like iomap takes inode_lock() when calling
> generic_write_sync().
> Also writeback already happens from other paths without inode lock,
> so it's difficult to say that we really need sync_mapping_buffers() to
> take any inode locking here. Having said that, let's add
> generic_buffer_fsync() implementation in buffer.c with no
> inode_lock/unlock() for now so that filesystems like ext2 and
> ext4's nojournal mode can use it.
> 
> Ext4 when got converted to iomap for direct-io already copied it's own
> variant of __generic_file_fsync() without lock. Hence let's add a helper
> API and use it both in ext2 and ext4.
> 
> Later we can review other filesystems as well to see if we can make
> generic_buffer_fsync() which does not take any inode_lock() as the
> default path.
> 
> Tested-by: Disha Goel <disgoel@...ux.ibm.com>
> Reviewed-by: Christoph Hellwig <hch@....de>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@...il.com>

There is a problem with generic_buffer_fsync() that it does not call
blkdev_issue_flush() so the caller is responsible for doing that. That's
necessary for ext2 & ext4 so fine for now. But historically this was the
case with generic_file_fsync() as well and that led to many filesystem
forgetting to flush caches from fsync(2). What is our transition plan for
these filesystems that currently do the cache flush from
generic_file_fsync()? Do we want to eventually keep generic_file_fsync()
doing the cache flush and call generic_buffer_fsync() instead of
__generic_buffer_fsync() from it?

								Honza

> ---
>  fs/buffer.c                 | 43 +++++++++++++++++++++++++++++++++++++
>  include/linux/buffer_head.h |  2 ++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 9e1e2add541e..df98f1966a71 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -593,6 +593,49 @@ int sync_mapping_buffers(struct address_space *mapping)
>  }
>  EXPORT_SYMBOL(sync_mapping_buffers);
>  
> +/**
> + * generic_buffer_fsync - generic buffer fsync implementation
> + * for simple filesystems with no inode lock
> + *
> + * @file:	file to synchronize
> + * @start:	start offset in bytes
> + * @end:	end offset in bytes (inclusive)
> + * @datasync:	only synchronize essential metadata if true
> + *
> + * This is a generic implementation of the fsync method for simple
> + * filesystems which track all non-inode metadata in the buffers list
> + * hanging off the address_space structure.
> + */
> +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
> +			 bool datasync)
> +{
> +	struct inode *inode = file->f_mapping->host;
> +	int err;
> +	int ret;
> +
> +	err = file_write_and_wait_range(file, start, end);
> +	if (err)
> +		return err;
> +
> +	ret = sync_mapping_buffers(inode->i_mapping);
> +	if (!(inode->i_state & I_DIRTY_ALL))
> +		goto out;
> +	if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
> +		goto out;
> +
> +	err = sync_inode_metadata(inode, 1);
> +	if (ret == 0)
> +		ret = err;
> +
> +out:
> +	/* check and advance again to catch errors after syncing out buffers */
> +	err = file_check_and_advance_wb_err(file);
> +	if (ret == 0)
> +		ret = err;
> +	return ret;
> +}
> +EXPORT_SYMBOL(generic_buffer_fsync);
> +
>  /*
>   * Called when we've recently written block `bblock', and it is known that
>   * `bblock' was for a buffer_boundary() buffer.  This means that the block at
> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
> index 8f14dca5fed7..3170d0792d52 100644
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -211,6 +211,8 @@ int inode_has_buffers(struct inode *);
>  void invalidate_inode_buffers(struct inode *);
>  int remove_inode_buffers(struct inode *inode);
>  int sync_mapping_buffers(struct address_space *mapping);
> +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
> +			 bool datasync);
>  void clean_bdev_aliases(struct block_device *bdev, sector_t block,
>  			sector_t len);
>  static inline void clean_bdev_bh_alias(struct buffer_head *bh)
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists