[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b90d5ba2-69a6-4373-8aea-16eff78a3419@huawei.com>
Date: Wed, 22 Jan 2025 15:31:30 +0800
From: Li Lingfeng <lilingfeng3@...wei.com>
To: NeilBrown <neilb@...e.de>, Jeff Layton <jlayton@...nel.org>
CC: Chuck Lever <chuck.lever@...cle.com>, <okorniev@...hat.com>,
<Dai.Ngo@...cle.com>, <tom@...pey.com>, <linux-nfs@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <yukuai1@...weicloud.com>,
<houtao1@...wei.com>, <yi.zhang@...wei.com>, <yangerkun@...wei.com>,
<lilingfeng@...weicloud.com>
Subject: Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list
在 2025/1/22 11:48, NeilBrown 写道:
> On Wed, 22 Jan 2025, NeilBrown wrote:
>> On Wed, 22 Jan 2025, Jeff Layton wrote:
>>> To be clear, I think we need to drop e57420be100ab from your nfsd-
>>> testing branch. The race I identified above is quite likely to occur
>>> and could lead to leaks.
>>>
>>> If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I
>>> think the RCU approach is safe.
>> I'm not convinced this is the right approach.
>> I cannot see how nfsd_file_put() can race with unhashing. If it cannot
>> then we can simply unconditionally call nfsd_file_schedule_laundrette().
>>
>> Can describe how the race can happen - if indeed it can.
> I thought I should explore this more and explain what I think actually
> happens ...
>
> Certainly nfsd_file_unhash() might race with nfsd_file_put(). At this
> point in nfsd_file_put() we have the only reference but a hash lookup
> could gain another reference and the immediately unhash it.
> nfsd_file_queue_for_close() can do this. There might be other paths.
>
> But why does this mean we need to remove it from the lru and free it
> immediately? If we leave it on the lru it will be freed in a couple of
> seconds.
>
> The reason might be nfsd_file_close_inode_sync(). This needs to close
> files before returning.
> But if nfsd_file_close_inode_sync() is called while some other thread
> holds a reference to the file and might want to call nfsd_file_put(),
> then it isn't going to succeed anyway so any race here doesn't make any
> difference.
>
> So I think the following might be the best fix
Agree.
I ignored the trace point in nfsd_file_lru_add(), this one looks better.
And I have tested it.
Thanks.
>
> ???
>
> Thanks,
> NeilBrown
>
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index fcd751cb7c76..773788a50e56 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -322,10 +322,13 @@ nfsd_file_check_writeback(struct nfsd_file *nf)
> static bool nfsd_file_lru_add(struct nfsd_file *nf)
> {
> set_bit(NFSD_FILE_REFERENCED, &nf->nf_flags);
> + rcu_read_lock();
> if (list_lru_add_obj(&nfsd_file_lru, &nf->nf_lru)) {
> trace_nfsd_file_lru_add(nf);
> + rcu_read_unlock();
> return true;
> }
> + rcu_read_unlock();
> return false;
> }
>
> @@ -371,19 +374,8 @@ nfsd_file_put(struct nfsd_file *nf)
>
> /* Try to add it to the LRU. If that fails, decrement. */
> if (nfsd_file_lru_add(nf)) {
> - /* If it's still hashed, we're done */
> - if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
> - nfsd_file_schedule_laundrette();
> - return;
> - }
> -
> - /*
> - * We're racing with unhashing, so try to remove it from
> - * the LRU. If removal fails, then someone else already
> - * has our reference.
> - */
> - if (!nfsd_file_lru_remove(nf))
> - return;
> + nfsd_file_schedule_laundrette();
> + return;
> }
> }
> if (refcount_dec_and_test(&nf->nf_ref))
>
>
Powered by blists - more mailing lists