[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87v7p06dgv.fsf@igalia.com>
Date: Thu, 12 Jun 2025 19:07:12 +0100
From: Luis Henriques <luis@...lia.com>
To: Jan Kara <jack@...e.cz>
Cc: Mateusz Guzik <mjguzik@...il.com>, Alexander Viro
<viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel-dev@...lia.com
Subject: Re: [PATCH] fs: drop assert in file_seek_cur_needs_f_lock
On Thu, Jun 12 2025, Jan Kara wrote:
> On Thu 12-06-25 15:55:40, Mateusz Guzik wrote:
>> On Thu, Jun 12, 2025 at 10:41:01AM +0100, Luis Henriques wrote:
>> > The assert in function file_seek_cur_needs_f_lock() can be triggered very
>> > easily because, as Jan Kara suggested, the file reference may get
>> > incremented after checking it with fdget_pos().
>> >
>> > Fixes: da06e3c51794 ("fs: don't needlessly acquire f_lock")
>> > Signed-off-by: Luis Henriques <luis@...lia.com>
>> > ---
>> > Hi Christian,
>> >
>> > It wasn't clear whether you'd be queueing this fix yourself. Since I don't
>> > see it on vfs.git, I decided to explicitly send the patch so that it doesn't
>> > slip through the cracks.
>> >
>> > Cheers,
>> > --
>> > Luis
>> >
>> > fs/file.c | 2 --
>> > 1 file changed, 2 deletions(-)
>> >
>> > diff --git a/fs/file.c b/fs/file.c
>> > index 3a3146664cf3..075f07bdc977 100644
>> > --- a/fs/file.c
>> > +++ b/fs/file.c
>> > @@ -1198,8 +1198,6 @@ bool file_seek_cur_needs_f_lock(struct file *file)
>> > if (!(file->f_mode & FMODE_ATOMIC_POS) && !file->f_op->iterate_shared)
>> > return false;
>> >
>> > - VFS_WARN_ON_ONCE((file_count(file) > 1) &&
>> > - !mutex_is_locked(&file->f_pos_lock));
>> > return true;
>> > }
>>
>> fdget_pos() can only legally skip locking if it determines to be in
>> position where nobody else can operate on the same file obj, meaning
>> file_count(file) == 1 and it can't go up. Otherwise the lock is taken.
>>
>> Or to put it differently, fdget_pos() NOT taking the lock and new refs
>> showing up later is a bug.
>
> I mostly agree and as I've checked again, this indeed seems to be the case
> as fdget() will increment f_ref if file table is shared with another thread
> and thus file_needs_f_pos_lock() returns true whenever there are more
> threads sharing the file table or if the struct file is dupped to another
> fd. That being said I find the assertion in file_seek_cur_needs_f_lock()
> misplaced - it just doesn't make sense in that place to me.
>
>> I don't believe anything of the sort is happening here.
>>
>> Instead, overlayfs is playing games and *NOT* going through fdget_pos():
>>
>> ovl_inode_lock(inode);
>> realfile = ovl_real_file(file);
>> [..]
>> ret = vfs_llseek(realfile, offset, whence);
>>
>> Given the custom inode locking around the call, it may be any other
>> locking is unnecessary and the code happens to be correct despite the
>> splat.
>
> Right and good spotting. That's indeed more likely explanation than mine.
> Actually custom locking around llseek isn't all that uncommon (mostly for
> historical reasons AFAIK but that's another story).
>
>> I think the safest way out with some future-proofing is to in fact *add*
>> the locking in ovl_llseek() to shut up the assert -- personally I find
>> it uneasy there is some underlying file obj flying around.
>
> Well, if you grep for vfs_llseek(), you'll see there are much more calls to
> it in the kernel than overlayfs. These callers outside of fs/read_write.c
> are responsible for their locking. So I don't think keeping the assert in
> file_seek_cur_needs_f_lock() makes any sense. If anything I'd be open to
> putting it in fdput_pos() or something like that.
Thank you Mateusz and Honza for looking into this. Overlayfs was indeed
my initial suspect, but I had two reasons for thinking that the assert was
the problem: 1) that code was there for quite some time and 2) nobody else
was reporting this issue.
>> Even if ultimately the assert has to go, the proposed commit message
>> does not justify it.
>
> I guess the commit message could be improved. Something like:
>
> The assert in function file_seek_cur_needs_f_lock() can be triggered very
> easily because there are many users of vfs_llseek() (such as overlayfs)
> that do their custom locking around llseek instead of relying on
> fdget_pos(). Just drop the overzealous assertion.
Thanks, makes more sense.
Christian, do you prefer me to resend the patch or is it easier for you to
just amend the commit? (Though, to be fair, the authorship could also be
changed as I mostly reported the issue and tested!)
Cheers,
--
Luís
Powered by blists - more mailing lists