Message-ID: <b1333f15-fb3d-5698-1852-47a55546bdb8@redhat.com>
Date: Mon, 7 Nov 2022 20:44:24 +0800
From: Xiubo Li <xiubli@...hat.com>
To: Jeff Layton <jlayton@...nel.org>, viro@...iv.linux.org.uk,
chuck.lever@...cle.com
Cc: axboe@...nel.dk, asml.silence@...il.com,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
io-uring@...r.kernel.org, ceph-devel@...r.kernel.org,
mchangir@...hat.com, idryomov@...il.com, lhenriques@...e.de,
gfarnum@...hat.com
Subject: Re: [RFC PATCH] fs/lock: increase the filp's reference for
Posix-style locks
On 07/11/2022 20:29, Jeff Layton wrote:
> On Mon, 2022-11-07 at 20:03 +0800, Xiubo Li wrote:
>> On 07/11/2022 18:33, Jeff Layton wrote:
>>> On Mon, 2022-11-07 at 17:52 +0800, xiubli@...hat.com wrote:
[...]
>>>> diff --git a/io_uring/openclose.c b/io_uring/openclose.c
>>>> index 67178e4bb282..5a12cdf7f8d0 100644
>>>> --- a/io_uring/openclose.c
>>>> +++ b/io_uring/openclose.c
>>>> @@ -212,6 +212,7 @@ int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>>> int io_close(struct io_kiocb *req, unsigned int issue_flags)
>>>> {
>>>> struct files_struct *files = current->files;
>>>> + fl_owner_t owner = file_lock_make_thread_owner(files);
>>>> struct io_close *close = io_kiocb_to_cmd(req, struct io_close);
>>>> struct fdtable *fdt;
>>>> struct file *file;
>>>> @@ -247,7 +248,7 @@ int io_close(struct io_kiocb *req, unsigned int issue_flags)
>>>> goto err;
>>>>
>>>> /* No ->flush() or already async, safely close from here */
>>>> - ret = filp_close(file, current->files);
>>>> + ret = filp_close(file, owner);
>>>> err:
>>>> if (ret < 0)
>>>> req_set_fail(req);
>>> I think this is the wrong approach to fixing this. It also looks like
>>> you could hit a similar problem with OFD locks and this patch wouldn't
>>> address that issue.
>> OFD locks set the 'file' struct as the owner, just as flock does, so
>> that should be okay. I don't think they have this issue, if my
>> understanding is correct here.
>>
> They set the owner to "file", but they don't hold a reference to it.
> With OFD locks, the file is what holds references to the lock, not the
> reverse.
Yeah, right. But neither OFD nor flock locks should hit this issue:
when all the locks with the same owner are removed, and that owner is
the 'file' passed to filp_close(filp), the 'file' reference counter
must still be larger than 0, because filp_close() is still using it.
This is why using the thread id as the owner is a special case for
POSIX-style locks.
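
To make the owner difference concrete from the userspace side, here is a
minimal sketch (not part of the patch; the helper names
locked_for_others() and lock_survives_other_close() are made up for this
illustration). It takes a lock through one descriptor, closes a second,
unrelated descriptor for the same file, and then asks from a child
process whether the lock is still there. It assumes Linux with OFD lock
support and a writable /tmp.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes F_OFD_SETLK in glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37          /* Linux value, in case the headers hide it */
#endif

/*
 * Hypothetical helper: ask from a separate process whether byte 0 of
 * 'path' is locked. Returns 1 if locked, 0 if free, -1 on error.
 */
int locked_for_others(const char *path)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
				    .l_start = 0, .l_len = 1 };
		int fd = open(path, O_RDWR);

		if (fd < 0 || fcntl(fd, F_GETLK, &fl) < 0)
			_exit(2);
		_exit(fl.l_type == F_UNLCK ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/*
 * Hypothetical helper: lock byte 0 via fd1 using 'cmd' (F_SETLK or
 * F_OFD_SETLK), close an unrelated fd2 on the same file, and report
 * whether the lock is still held. Returns 1, 0, or -1 on error.
 */
int lock_survives_other_close(int cmd)
{
	char path[] = "/tmp/lock-owner-XXXXXX";
	/* l_pid stays 0, which F_OFD_SETLK requires */
	struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
			    .l_start = 0, .l_len = 1 };
	int fd1, fd2, ret = -1;

	assert(cmd == F_SETLK || cmd == F_OFD_SETLK);
	fd1 = mkstemp(path);
	if (fd1 < 0)
		return -1;
	fd2 = open(path, O_RDWR);
	if (fd2 >= 0 && fcntl(fd1, cmd, &fl) == 0) {
		close(fd2);     /* drops POSIX locks owned by this process */
		fd2 = -1;
		ret = locked_for_others(path);
	}
	if (fd2 >= 0)
		close(fd2);
	close(fd1);
	unlink(path);
	return ret;
}
```

With F_SETLK I would expect the lock to be gone after the second
descriptor is closed, because the owner is the process and not the open
file. With F_OFD_SETLK it should survive until fd1 itself is closed.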
>
>>> The real bug seems to be that ceph_fl_release_lock dereferences fl_file,
>>> at a point when it shouldn't rely on that being valid. Most filesystems
>>> stash some info in fl->fl_u if they need to do bookkeeping after
>>> releasing a lock. Perhaps ceph should be doing something similar?
>> This is the 'filp' memory in filp_close(filp, ...):
>>
>> crash> file.f_path.dentry,f_inode 0xffff952d7ab46200
>> f_path.dentry = 0xffff9521b121cb40
>> f_inode = 0xffff951f3ea33550,
>>
>> We can see the 'f_inode' is pointing to the correct inode memory.
>>
>>
>>
>> While later in 'ceph_fl_release_lock()':
>>
>> 41 static void ceph_fl_release_lock(struct file_lock *fl)
>> 42 {
>> 43         struct ceph_file_info *fi = fl->fl_file->private_data;
>> 44         struct inode *inode = file_inode(fl->fl_file);
>> 45         struct ceph_inode_info *ci = ceph_inode(inode);
>> 46         atomic_dec(&fi->num_locks);
>> 47         if (atomic_dec_and_test(&ci->i_filelock_ref)) {
>> 48                 /* clear error when all locks are released */
>> 49                 spin_lock(&ci->i_ceph_lock);
>> 50                 ci->i_ceph_flags &= ~CEPH_I_ERROR_FILELOCK;
>> 51                 spin_unlock(&ci->i_ceph_lock);
>> 52         }
>> 53 }
>>
> You only need the inode for most of this. The exception is
> fi->num_locks, so you may need to test for that in a different way.
>
>> It crashed at line 47, and the 'fl->fl_file' memory is:
>>
>> crash> file.f_path.dentry,f_inode 0xffff952d4ebd8a00
>> f_path.dentry = 0x0
>> f_inode = 0x0,
>>
>> Please NOTE: 'filp' and 'fl->fl_file' are two different 'struct file' instances.
>>
> Yep, I understand the bug. I just don't like the proposed fix. :)
Yeah, I also think this approach is ugly :-)
>> Can we fix this by using 'fl->fl_u' here?
>>
> Probably. You could take and hold an inode reference in there, and maybe
> add a function that looks at whether there are any locks held against a
> particular file, rather than trying to count locks in ceph_file_info.
Okay, this sounds good.
Let me try this tomorrow.
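
A rough userspace model of that suggestion, only to show the shape of
it. This is not kernel code, and every name below (model_inode,
model_lock, model_lock_init, model_lock_release) is invented for the
illustration. The lock stashes the inode pointer and holds a reference
on it, playing the role that a field in 'fl->fl_u' would play, so the
release path never has to look at 'fl->fl_file'. The second half of the
suggestion, replacing fi->num_locks with a lookup of the locks held
against a file, is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the ceph inode state touched by the release path. */
struct model_inode {
	int refcount;           /* stands in for the inode reference count */
	int filelock_ref;       /* stands in for ci->i_filelock_ref */
	bool error_filelock;    /* stands in for CEPH_I_ERROR_FILELOCK */
};

/* Stand-in for struct file_lock; 'inode' plays the role of fl->fl_u. */
struct model_lock {
	struct model_inode *inode;
};

/* At lock time: take an inode reference and remember the inode. */
void model_lock_init(struct model_lock *fl, struct model_inode *inode)
{
	inode->refcount++;
	inode->filelock_ref++;
	fl->inode = inode;
}

/* At release time: use only the stashed inode, never the file. */
void model_lock_release(struct model_lock *fl)
{
	struct model_inode *inode = fl->inode;

	assert(inode != NULL);
	if (--inode->filelock_ref == 0)
		inode->error_filelock = false;  /* all locks released */
	inode->refcount--;                      /* drop the stashed reference */
	fl->inode = NULL;
}
```

Since the release path only touches the stashed inode, it no longer
matters whether the file that created the lock has already been torn
down by the time the lock is released.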
>> I was also thinking I could just call 'get_file(file)' in
>> ceph_lock() and then drop that reference in ceph_fl_release_lock().
>> How about this?
>>
> That may work too, though again, I'd be worried about cyclical
> dependencies, particularly with OFD locks. If the lock holds a reference
> to the file, then can the file's refcount ever go to zero if the lock is
> never explicitly released? I think not.
>
> You may also need to consider flock locks too, since they have similar
> ownership semantics to OFD locks.
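
The cycle described above can be modelled in a few lines of userspace
C. This is a toy sketch, not kernel code, and all names (model_file,
model_fput, model_lock, model_close_frees_file) are hypothetical. It
assumes, as in the discussion, that locks owned by the file are only
dropped when the file's last reference goes away.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for struct file with its OFD/flock-style locks. */
struct model_file {
	int refcount;
	int nr_locks;   /* locks whose owner is this file */
	bool freed;
};

/* Drop one reference; the final put removes the file's locks. */
void model_fput(struct model_file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		f->nr_locks = 0;
		f->freed = true;
	}
}

/* Take a lock owned by the file; optionally pin the file from the lock. */
void model_lock(struct model_file *f, bool lock_pins_file)
{
	f->nr_locks++;
	if (lock_pins_file)
		f->refcount++;  /* the get_file()-in-lock idea */
}

/* One open, one lock, one close: does the file get freed? */
bool model_close_frees_file(bool lock_pins_file)
{
	struct model_file f = { .refcount = 1 };

	model_lock(&f, lock_pins_file);
	model_fput(&f);         /* userspace closes its only descriptor */
	return f.freed;
}
```

When the lock pins the file, the close leaves the refcount at 1 and
nothing is left to drop it, so the final put that would remove the lock
never happens.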
I will send a V2 later.
Thanks Jeff!
- Xiubo