lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bc3f20c3-21f8-443b-619f-da7569b37aaf@huaweicloud.com>
Date: Wed, 2 Jul 2025 15:30:52 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Nilay Shroff <nilay@...ux.ibm.com>, Ming Lei <ming.lei@...hat.com>,
 Yu Kuai <yukuai1@...weicloud.com>
Cc: josef@...icpanda.com, axboe@...nel.dk, hch@...radead.org, hare@...e.de,
 linux-block@...r.kernel.org, nbd@...er.debian.org,
 linux-kernel@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
 johnny.chenyi@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH] nbd: fix false lockdep deadlock warning

Hi,

在 2025/07/02 14:22, Nilay Shroff 写道:
> 
> 
> On 7/2/25 8:02 AM, Ming Lei wrote:
>> On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
>>> Hi,
>>>
>>> 在 2025/07/01 21:28, Nilay Shroff 写道:
>>>>
>>>>
>>>> On 6/28/25 6:18 AM, Yu Kuai wrote:
>>>>> Hi,
>>>>>
>>>>> 在 2025/06/27 19:04, Ming Lei 写道:
>>>>>> I guess the patch in the following link may be simper, both two take
>>>>>> similar approach:
>>>>>>
>>>>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
>>>>>
>>>>> I this the above approach has concurrent problems if nbd_start_device
>>>>> concurrent with nbd_start_device:
>>>>>
>>>>> t1:
>>>>> nbd_start_device
>>>>> lock
>>>>> num_connections = 1
>>>>> unlock
>>>>>       t2:
>>>>>       nbd_add_socket
>>>>>       lock
>>>>>       config->num_connections++
>>>>>       unlock
>>>>>           t3:
>>>>>           nbd_start_device
>>>>>           lock
>>>>>           num_connections = 2
>>>>>           unlock
>>>>>           blk_mq_update_nr_hw_queues
>>>>>
>>>>> blk_mq_update_nr_hw_queues
>>>>> //nr_hw_queues updated to 1 before failure
>>>>> return -EINVAL
>>>>>
>>>>
>>>> In the above case, yes I see that t1 would return -EINVAL (as
>>>> config->num_connections doesn't match with num_connections)
>>>> but then t3 would succeed to update nr_hw_queue (as both
>>>> config->num_connections and num_connections set to 2 this
>>>> time). Isn't it? If yes, then the above patch (from Ming)
>>>> seems good.
>>>
>>> Emm, I'm confused, If you agree with the concurrent process, then
>>> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
>>> nr_hw_queues back to 1 and return failure.
>>
>> It should be easy to avoid failure by simple retrying.
>>
> Yeah I think retry should be a safe bet here.
> 

I really not sure about the retry, the above is just a scenario that I
think of with a quick review, and there are still many concurrent
scenarios that need to be checked, I'm kind of lost here.

Except nbd_start_device() and nbd_add_socked(), I'm not confident
other context that is synchronized with config_lock is not broken.
However, I'm ok with the bet.

> On another note, synchronizing nbd_start_device and nbd_add_socket
> using nbd->task_setup looks more complex and rather we may use
> nbd->pid to synchronize both. We need to move setting of nbd->pid
> before we invoke blk_mq_update_nr_hw_queues in nbd_start_device.
> Then in nbd_add_socket we can evaluate nbd->pid and if it's
> non-NULL then we could assume that either nr_hw_queues update is in
> progress or device has been setup and so return -EBUSY. I think
> anyways updating number of connections once device is configured
> would not be possible, so once nbd_start_device is initiated, we
> shall prevent user adding more connections. If we follow this
> approach then IMO we don't need to add retry discussed above.

It's ok for me to forbit nbd_add_socked after nbd is configured, there
is nowhere to use the added sock. And if there really are other contexts
need to be synchronized, I think nbd->pid can be used as well.

> 
> Thanks,
> --Nilay
> .
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ