[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJfpegtcHW8AwjfjDSm8Y7OXbesrw=ZpX-CMujJ=1Zz_Ly2FdQ@mail.gmail.com>
Date: Fri, 30 Sep 2022 15:35:16 +0200
From: Miklos Szeredi <miklos@...redi.hu>
To: Tycho Andersen <tycho@...ho.pizza>
Cc: Eric Biederman <ebiederm@...ssion.com>,
"Serge E. Hallyn" <serge@...lyn.com>, linux-kernel@...r.kernel.org,
fuse-devel@...ts.sourceforge.net
Subject: Re: [PATCH v2] fuse: In fuse_flush only wait if someone wants the
return code
On Thu, 29 Sept 2022 at 18:40, Tycho Andersen <tycho@...ho.pizza> wrote:
>
> If a fuse filesystem is mounted inside a container, there is a problem
> during pid namespace destruction. The scenario is:
>
> 1. task (a thread in the fuse server, with a fuse file open) starts
> exiting, does exit_signals(), goes into fuse_flush() -> wait
Can't the same happen through
fuse_flush -> fuse_sync_writes -> fuse_set_nowrite -> wait
?
> 2. fuse daemon gets killed, tries to wake everyone up
> 3. task from 1 is stuck because complete_signal() doesn't wake it up, since
> it has PF_EXITING.
>
> The result is that the thread will never be woken up, and pid namespace
> destruction will block indefinitely.
>
> To add insult to injury, nobody is waiting for these return codes, since
> the pid namespace is being destroyed.
>
> To fix this, let's not block on flush operations when the current task has
> PF_EXITING.
>
> This does change the semantics slightly: the wait here is for posix locks
> to be unlocked, so the task will exit before things are unlocked. To quote
> Miklos: https://lore.kernel.org/all/CAJfpegsTmiO-sKaBLgoVT4WxDXBkRES=HF1YmQN1ES7gfJEJ+w@mail.gmail.com/
>
> > "remote" posix locks are almost never used due to problems like this,
> > so I think it's safe to do this.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>
> Signed-off-by: Tycho Andersen <tycho@...ho.pizza>
> Link: https://lore.kernel.org/all/YrShFXRLtRt6T%2Fj+@risky/
> ---
> v2: drop the fuse_flush_async() function and just re-use the already
> prepared args; add a description of the problem+note about posix locks
> ---
> fs/fuse/file.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 05caa2b9272e..20bbe3e1afc7 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -464,6 +464,34 @@ static void fuse_sync_writes(struct inode *inode)
> fuse_release_nowrite(inode);
> }
>
> +struct fuse_flush_args {
> + struct fuse_args args;
> + struct fuse_flush_in inarg;
> + struct inode *inode;
> + struct fuse_file *ff;
> +};
> +
> +static void fuse_flush_end(struct fuse_mount *fm, struct fuse_args *args, int err)
> +{
> + struct fuse_flush_args *fa = container_of(args, typeof(*fa), args);
> +
> + if (err == -ENOSYS) {
> + fm->fc->no_flush = 1;
> + err = 0;
> + }
> +
> + /*
> + * In memory i_blocks is not maintained by fuse, if writeback cache is
> + * enabled, i_blocks from cached attr may not be accurate.
> + */
> + if (!err && fm->fc->writeback_cache)
> + fuse_invalidate_attr_mask(fa->inode, STATX_BLOCKS);
This is still duplicating code, can you please create a helper?
Thanks,
Miklos
Powered by blists - more mailing lists