linux-kernel - Re: [PATCH 1/1] FUSE: Allow parallel direct writes on the same file

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJfpegvU+y+WRhWrgWfc_7O5BdVi=N+3RUuVdBFFoyYVr9MDeg@mail.gmail.com>
Date:   Thu, 21 Apr 2022 17:22:40 +0200
From:   Miklos Szeredi <miklos@...redi.hu>
To:     Dharmendra Singh <dharamhans87@...il.com>
Cc:     Bernd Schubert <bschubert@....com>, linux-fsdevel@...r.kernel.org,
        fuse-devel <fuse-devel@...ts.sourceforge.net>,
        linux-kernel@...r.kernel.org, Dharmendra Singh <dsingh@....com>
Subject: Re: [PATCH 1/1] FUSE: Allow parallel direct writes on the same file

On Fri, 8 Apr 2022 at 08:18, Dharmendra Singh <dharamhans87@...il.com> wrote:
>
> As of now, in Fuse, direct writes on the same file are serialized
> over inode lock i.e we hold inode lock for the whole duration of
> the write request. This serialization works pretty well for the FUSE
> user space implementations which rely on this inode lock for their
> cache/data integrity etc. But it hurts badly such FUSE implementations
> which has their own ways of mainting data/cache integrity and does not
> use this serialization at all.
>
> This patch allows parallel direct writes on the same file with the
> help of a flag called FOPEN_PARALLEL_WRITES. If this flag is set on
> the file (flag is passed from libfuse to fuse kernel as part of file
> open/create), we do not hold inode lock for the whole duration of the
> request, instead acquire it only to protect updates on certain fields
> of the inode. FUSE implementations which rely on this inode lock can
> continue to do so and this is default behaviour.
>
> Signed-off-by: Dharmendra Singh <dsingh@....com>
> ---
>  fs/fuse/file.c            | 38 ++++++++++++++++++++++++++++++++++----
>  include/uapi/linux/fuse.h |  2 ++
>  2 files changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 37eebfb90500..d3e8f44c1228 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1465,6 +1465,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
>         int err = 0;
>         struct fuse_io_args *ia;
>         unsigned int max_pages;
> +       bool p_write = write &&
> +               (ff->open_flags & FOPEN_PARALLEL_WRITES) ? true : false;
>
>         max_pages = iov_iter_npages(iter, fc->max_pages);
>         ia = fuse_io_alloc(io, max_pages);
> @@ -1472,10 +1474,11 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
>                 return -ENOMEM;
>
>         if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) {
> -               if (!write)
> +               /* Parallel write does not come with inode lock held */
> +               if (!write || p_write)

Probably would be good to add an inode_is_locked() assert in
fuse_sync_writes() to make sure we don't miss cases silently.

>                         inode_lock(inode);
>                 fuse_sync_writes(inode);
> -               if (!write)
> +               if (!write || p_write)
>                         inode_unlock(inode);
>         }
>
> @@ -1568,22 +1571,36 @@ static ssize_t fuse_direct_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  {
>         struct inode *inode = file_inode(iocb->ki_filp);
> +       struct file *file = iocb->ki_filp;
> +       struct fuse_file *ff = file->private_data;
>         struct fuse_io_priv io = FUSE_IO_PRIV_SYNC(iocb);
>         ssize_t res;
> +       bool p_write = ff->open_flags & FOPEN_PARALLEL_WRITES ? true : false;
> +       bool unlock_inode = true;
>
>         /* Don't allow parallel writes to the same file */
>         inode_lock(inode);
>         res = generic_write_checks(iocb, from);

I don't think this needs inode lock.  At least nfs_file_direct_write()
doesn't have it.

What it does have, however is taking the inode lock for shared for the
actual write operation, which is probably something that fuse needs as
well.

Also I worry about size extending writes not holding the inode lock
exclusive.  Would that be a problem in your use case?

>         if (res > 0) {
> +               /* Allow parallel writes on the inode by unlocking it */
> +               if (p_write) {
> +                       inode_unlock(inode);
> +                       unlock_inode = false;
> +               }
>                 if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
>                         res = fuse_direct_IO(iocb, from);
>                 } else {
>                         res = fuse_direct_io(&io, from, &iocb->ki_pos,
>                                              FUSE_DIO_WRITE);
> +                       if (p_write) {
> +                               inode_lock(inode);
> +                               unlock_inode = true;
> +                       }
>                         fuse_write_update_attr(inode, iocb->ki_pos, res);

This doesn't need the inode lock either if the actual write wasn't locked.

>                 }
>         }
> -       inode_unlock(inode);
> +       if (unlock_inode)
> +               inode_unlock(inode);
>
>         return res;
>  }
> @@ -2850,10 +2867,16 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>         size_t count = iov_iter_count(iter), shortened = 0;
>         loff_t offset = iocb->ki_pos;
>         struct fuse_io_priv *io;
> -
> +       bool p_write = (iov_iter_rw(iter) == WRITE &&
> +                       ff->open_flags & FOPEN_PARALLEL_WRITES);
>         pos = offset;
>         inode = file->f_mapping->host;
> +
> +       if (p_write)
> +               inode_lock(inode);
>         i_size = i_size_read(inode);

Neither this.

> +       if (p_write)
> +               inode_unlock(inode);
>
>         if ((iov_iter_rw(iter) == READ) && (offset >= i_size))
>                 return 0;
> @@ -2924,9 +2947,16 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>         kref_put(&io->refcnt, fuse_io_release);
>
>         if (iov_iter_rw(iter) == WRITE) {
> +
> +               if (p_write)
> +                       inode_lock(inode);
> +
>                 fuse_write_update_attr(inode, pos, ret);
>                 if (ret < 0 && offset + count > i_size)
>                         fuse_do_truncate(file);
> +
> +               if (p_write)
> +                       inode_unlock(inode);

Truncation needs the inode lock, though I'm not completely
understanding why this truncation is needed.  But for example here it
is assumed that file size won't change, which means that non-extending
writes should hold inode lock shared and extending writes should hold
inode lock exculsive to meet this assumption.

Thanks,
Miklos