Message-ID: <CAGudoHH20JVecjRQEPa3q=k8ax3hqt-LGA3P1S-xFFZYxisL6Q@mail.gmail.com>
Date: Wed, 27 Sep 2023 23:06:53 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <brauner@...nel.org>, viro@...iv.linux.org.uk,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v2] vfs: shave work on failed file open
On 9/27/23, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> Btw, I think we could get rid of the RCU freeing of 'struct file *'
> entirely.
>
> The way to fix it is
>
> (a) make sure all f_count accesses are atomic ops (the one special
> case is the "0 -> X" initialization, which is ok)
>
> (b) make filp_cachep be SLAB_TYPESAFE_BY_RCU
>
> because then get_file_rcu() can do the atomic_long_inc_not_zero()
> knowing it's still a 'struct file *' while holding the RCU read lock
> even if it was just free'd.
>
> And __fget_files_rcu() will then re-check that it's the *right*
> 'struct file *' and do a fput() on it and re-try if it isn't. End
> result: no need for any RCU freeing.
>
> But the difference is that a *new* 'struct file *' might see a
> temporary atomic increment / decrement of the file pointer because
> another CPU is going through that __fget_files_rcu() dance.
>
I think you attached the wrong file: it has next to no changes and, in
particular, nothing for fd lookup.

You may find it interesting that both NetBSD and FreeBSD have been
doing something to that effect for years now in order to provide
lockless fd lookup despite not having an equivalent to RCU (what they
did have at the time is "type stable" memory: objects can get reused
but the memory can *never* get freed. Utterly gross, but that's old
Unix for you).
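
It does work, but I always found it dodgy because it backpedals in a
way which is not free of side effects: a lookup which lost the race
bumps the refcount and then drops it again, and whoever owns the
reused object by then can see that.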

Note that validating you got the right file requires, at a bare
minimum, reloading the fd table pointer, because you might have been
racing against close *and* resize.
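
A rough userspace sketch of that lookup-and-revalidate sequence, using
C11 atomics. The struct layouts and the names lookup_fd(),
get_file_not_zero() and put_file() are made up for illustration; this is
not the kernel code, and it assumes the memory behind a 'struct file'
always stays a 'struct file' (type-stable), so touching f_count on a
just-freed object is safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, heavily simplified. */
struct file {
    atomic_long f_count;            /* 0 means free or being recycled */
    int id;
};

struct fdtable {
    size_t max_fds;
    _Atomic(struct file *) *fd;     /* array of max_fds slots */
};

struct files {
    _Atomic(struct fdtable *) fdt;  /* swapped out on resize */
};

/* Take a reference only if the count is not already zero. */
static int get_file_not_zero(struct file *f)
{
    long old = atomic_load(&f->f_count);

    while (old != 0) {
        /* On failure 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
            return 1;
    }
    return 0;
}

/* Drop a reference. At zero a real implementation would hand the
 * object back to a type-stable cache; that part is omitted here. */
static void put_file(struct file *f)
{
    long old = atomic_fetch_sub(&f->f_count, 1);

    assert(old > 0);
}

static struct file *lookup_fd(struct files *files, size_t fd)
{
    for (;;) {
        struct fdtable *fdt = atomic_load(&files->fdt);
        struct file *f;

        if (fd >= fdt->max_fds)
            return NULL;
        f = atomic_load(&fdt->fd[fd]);
        if (f == NULL)
            return NULL;
        if (!get_file_not_zero(f))
            continue;               /* freed under us, start over */

        /* Revalidate: reload the table pointer (resize) and the slot
         * (close). Only if both still match is this the right file. */
        if (atomic_load(&files->fdt) == fdt &&
            atomic_load(&fdt->fd[fd]) == f)
            return f;

        /* Wrong object: backpedal. The object's new owner may have
         * observed the temporary increment. */
        put_file(f);
    }
}
```

The put_file() call on the retry path is the backpedaling in question:
between the increment and the decrement, a new owner of the recycled
object sees a refcount it did not ask for.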
--
Mateusz Guzik <mjguzik gmail.com>