Message-ID: <f0447b4b-3068-4943-a2a8-782308311cfe@huaweicloud.com>
Date: Wed, 20 Aug 2025 20:08:36 +0800
From: Wang Zhaolong <wangzhaolong@...weicloud.com>
To: Dan Carpenter <dan.carpenter@...aro.org>
Cc: sfrench@...ba.org, pc@...guebit.org, linux-cifs@...r.kernel.org,
samba-technical@...ts.samba.org, linux-kernel@...r.kernel.org,
chengzhihao1@...wei.com, yi.zhang@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH v4] smb: client: Fix mount deadlock by avoiding super
block iteration in DFS reconnect
> On Fri, Aug 15, 2025 at 11:16:18AM +0800, Wang Zhaolong wrote:
>> diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
>> index f65a8a90ba27..37d83aade843 100644
>> --- a/fs/smb/client/dfs.c
>> +++ b/fs/smb/client/dfs.c
>> @@ -429,11 +429,11 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon)
>> tcon, tcon->ses->local_nls);
>> goto out;
>> }
>>
>> sb = cifs_get_dfs_tcon_super(tcon);
>> - if (!IS_ERR(sb))
>> + if (!IS_ERR_OR_NULL(sb))
>> cifs_sb = CIFS_SB(sb);
>>
>
> This is a bad or incomplete fix. When functions return BOTH error
> pointers and NULL it MEANS something. The NULL return in this case
> is a special kind of success.
>
> For example, if you look up a file, then an error means the
> lookup failed because we're not allowed to have filenames '/' so that's
> -EINVAL or maybe there was an allocation failure so that's -ENOMEM or
> maybe you don't have access to the directory so it's -EPERM. The NULL
> would mean that the lookup succeeded fine, but the file was not found.
>
> Another common use case is "get the LED functions so I can blink
> them". -EPROBE_DEFER means the LED subsystem isn't ready yet, but NULL
> means the administrator has deliberately disabled it. It's not an error;
> it's deliberate.
>
> It needs to be documented what the NULL return *means*. The documentation
> is missing here.
>
> See my blog for more details.
> https://staticthinking.wordpress.com/2022/08/01/mixing-error-pointers-and-null/
>
> regards,
> dan carpenter

Hi Dan,

Thank you for your valuable feedback and the insightful blog post. You're
absolutely right - mixing error pointers and NULL without clear semantics
is problematic.
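
To make the distinction concrete, here is a minimal userspace sketch of the
three-way contract described above. It is not kernel code: ERR_PTR(),
PTR_ERR(), IS_ERR() and IS_ERR_OR_NULL() are simplified stand-ins for the
helpers in <linux/err.h>, and lookup_entry(), entry_value_or_default() and the
table are hypothetical names invented purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified userspace stand-ins for the <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical data, for illustration only. */
struct entry {
	const char *name;
	int value;
};

static struct entry table[] = { { "alpha", 1 }, { "beta", 2 } };

/*
 * lookup_entry() - hypothetical lookup with a documented contract.
 *
 * Return: a valid pointer if @name was found,
 *         NULL if the lookup succeeded but no such entry exists,
 *         ERR_PTR(-EINVAL) if @name is NULL or contains '/'.
 */
static struct entry *lookup_entry(const char *name)
{
	size_t i;

	if (!name || strchr(name, '/'))
		return ERR_PTR(-EINVAL);

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(table[i].name, name) == 0)
			return &table[i];

	return NULL; /* success: there is simply nothing to return */
}

/*
 * The caller treats the three outcomes separately: errors are propagated,
 * NULL falls back to @def, and a valid pointer is dereferenced.
 */
static int entry_value_or_default(const char *name, int def, int *out)
{
	struct entry *e;

	assert(out != NULL);

	e = lookup_entry(name);
	if (IS_ERR(e))
		return (int)PTR_ERR(e);

	*out = e ? e->value : def;
	return 0;
}
```

In this sketch, replacing IS_ERR() with IS_ERR_OR_NULL() in the caller would
avoid a NULL dereference but would also turn the documented "not found" case
into a negative return of 0 from PTR_ERR(), which is the kind of silent
ambiguity the review points out.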
I've just posted a v5 patch [1] that takes a completely different approach:
- Removes cifs_get_dfs_tcon_super() entirely (no more ERR_PTR/NULL confusion)
- Directly updates DFS mount prepaths without searching through superblocks
- Eliminates the deadlock by avoiding iterate_supers_type() completely
Thank you again for catching this issue - it led me to a much better
solution.

[1] https://lore.kernel.org/all/20250820113435.2319994-1-wangzhaolong@huaweicloud.com/

Best regards,
Wang Zhaolong