lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7795ed72-8b77-14ea-cf18-78870e58f429@suse.com>
Date:   Thu, 26 Aug 2021 16:45:21 +0800
From:   Gang He <ghe@...e.com>
To:     Joseph Qi <joseph.qi@...ux.alibaba.com>, mark@...heh.com,
        jlbec@...lplan.org
Cc:     linux-kernel@...r.kernel.org, ocfs2-devel@....oracle.com,
        akpm@...ux-foundation.org
Subject: Re: [PATCH] ocfs2: ocfs2_downconvert_lock failure results in deadlock

Hi Joseph,

On 2021/8/26 16:23, Joseph Qi wrote:
> 
> 
> On 8/26/21 2:10 PM, Gang He wrote:
>> Usually, ocfs2_downconvert_lock() function always downconverts
>> dlm lock to the expected level for satisfy dlm bast requests
> 
> s/for/to
> 
>> from the other nodes.
>> But there is a rare situation. When dlm lock conversion is being
>> canceled, ocfs2_downconvert_lock() function will return -EBUSY.
>> You need to be aware that ocfs2_cancel_convert() function is
>> asynchronous in fsdlm implementation.
>> If we does not requeue this lockres entry, ocfs2 downconvert
>> thread no longer handles this dlm lock bast request. Then, the
>> other nodes will not get the dlm lock again, the current node's
>> process will be blocked when acquire this dlm lock again.
>>
>> Signed-off-by: Gang He <ghe@...e.com>
>> ---
>>   fs/ocfs2/dlmglue.c | 13 +++++++++----
>>   1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
>> index 48fd369c29a4..c454c218fbfe 100644
>> --- a/fs/ocfs2/dlmglue.c
>> +++ b/fs/ocfs2/dlmglue.c
>> @@ -3671,13 +3671,11 @@ static int ocfs2_downconvert_lock(struct ocfs2_super *osb,
>>   			     OCFS2_LOCK_ID_MAX_LEN - 1);
>>   	lockres_clear_pending(lockres, generation, osb);
>>   	if (ret) {
>> -		ocfs2_log_dlm_error("ocfs2_dlm_lock", ret, lockres);
>> +		if (ret != -EBUSY)
>> +			ocfs2_log_dlm_error("ocfs2_dlm_lock", ret, lockres);
> 
> Do we have to treat EBUSY as a normal case?
Yes, this return code is expected when call dlm_lock() to convert a dlm 
lock to another level, but this dlm lock is being cancelled.
As I said in another mail, for fsdlm implementation,ocfs2_cancel_convert
will return immediately, but the related dlm lock will(is) be cancelled 
in background. For o2dlm implementation,ocfs2_cancel_convert will return 
after the dlm lock is cancelled and it's ast is invoked, that is why 
o2cb based ocfs2 does not encounter -EBUSY return code in my test 
script, of course, this kind of implementation will block other lockres 
entries to be handled for a little time in down-convert thread.

> 
>>   		ocfs2_recover_from_dlm_error(lockres, 1);
>> -		goto bail;
>>   	}
>>   
>> -	ret = 0;
>> -bail:
>>   	return ret;
>>   }
>>   
>> @@ -3912,6 +3910,13 @@ static int ocfs2_unblock_lock(struct ocfs2_super *osb,
>>   	spin_unlock_irqrestore(&lockres->l_lock, flags);
>>   	ret = ocfs2_downconvert_lock(osb, lockres, new_level, set_lvb,
>>   				     gen);
>> +	/* ocfs2_cancel_convert() is in progress, try again later */
>> +	if (ret == -EBUSY) {
>> +		ctl->requeue = 1;
>> +		mlog(ML_BASTS, "lockres %s, ReQ: Downconvert busy\n",
>> +		     lockres->l_name);
>> +		ret = 0;
> 
> Ditto. If EBUSY is not a normal case, I'd like just requeue it but not
> change it to normal return code.
> You know ML_BASTS is always switched off in production environment.
Since this case should be considered as a normal case, although it's rare.
We should not print any error message to kernel journal, but if the user
turn on the BASTS trace, he should watch this trace for debugging.

Thanks
Gang

> 
> Thanks,
> Joseph
> 
>> +	}
>>   
>>   leave:
>>   	if (ret)
>>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ