Message-ID: <aF56oVEzTygIOUTN@fedora>
Date: Fri, 27 Jun 2025 19:04:01 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Yu Kuai <yukuai1@...weicloud.com>
Cc: josef@...icpanda.com, axboe@...nel.dk, hch@...radead.org,
	nilay@...ux.ibm.com, hare@...e.de, linux-block@...r.kernel.org,
	nbd@...er.debian.org, linux-kernel@...r.kernel.org,
	yukuai3@...wei.com, yi.zhang@...wei.com, yangerkun@...wei.com,
	johnny.chenyi@...wei.com
Subject: Re: [PATCH] nbd: fix false lockdep deadlock warning

On Fri, Jun 27, 2025 at 05:23:48PM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@...wei.com>
> 
> The deadlock is reported because there is a circular dependency:
> 
> t1: disk->open_mutex -> nbd->config_lock
> 
>  blkdev_release
>   bdev_release
>    //lock disk->open_mutex
>    blkdev_put_whole
>     nbd_release
>      nbd_config_put
>       refcount_dec_and_mutex_lock
>       //lock nbd->config_lock
> 
> t2: nbd->config_lock -> set->update_nr_hwq_lock
> 
>  nbd_genl_connect
>   //lock nbd->config_lock
>   nbd_start_device
>    blk_mq_update_nr_hw_queues
>    //lock set->update_nr_hwq_lock
> 
> t3: set->update_nr_hwq_lock -> disk->open_mutex
> 
>  nbd_dev_remove_work
>   nbd_dev_remove
>    del_gendisk
>     down_read(&set->update_nr_hwq_lock);
>     __del_gendisk
>      mutex_lock(&disk->open_mutex);
> 
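
To make the cycle easier to see, here is a minimal userspace sketch of the
same three lock chains (my own illustration, not part of the patch): three
pthread mutexes stand in for disk->open_mutex, nbd->config_lock and
set->update_nr_hwq_lock. In the kernel the last one is an rwsem taken for
reading in t3, but a plain mutex is enough to show the A->B, B->C, C->A
cycle that lockdep reports:

#include <pthread.h>

static pthread_mutex_t open_mutex         = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t config_lock        = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t update_nr_hwq_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take two locks in order, then drop them again. */
static void lock_pair(pthread_mutex_t *outer, pthread_mutex_t *inner)
{
	pthread_mutex_lock(outer);
	pthread_mutex_lock(inner);
	pthread_mutex_unlock(inner);
	pthread_mutex_unlock(outer);
}

/* t1: blkdev_release, disk->open_mutex -> nbd->config_lock */
static void *t1(void *arg)
{
	lock_pair(&open_mutex, &config_lock);
	return NULL;
}

/* t2: nbd_genl_connect, nbd->config_lock -> set->update_nr_hwq_lock */
static void *t2(void *arg)
{
	lock_pair(&config_lock, &update_nr_hwq_lock);
	return NULL;
}

/* t3: nbd_dev_remove_work, set->update_nr_hwq_lock -> disk->open_mutex,
 * which closes the cycle. */
static void *t3(void *arg)
{
	lock_pair(&update_nr_hwq_lock, &open_mutex);
	return NULL;
}

int main(void)
{
	pthread_t a, b, c;

	pthread_create(&a, NULL, t1, NULL);
	pthread_create(&b, NULL, t2, NULL);
	pthread_create(&c, NULL, t3, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_join(c, NULL);
	return 0;
}

No single pair of threads deadlocks on its own, but the three acquisition
orders together form a cycle, which is exactly what lockdep flags.
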
> This is a false warning because t1 and t2 are already synchronized by
> nbd->refs: t2 runs while a reference is still held, and t1 only takes
> config_lock once the reference is decreased to 0. However, from lockdep's
> point of view the lock order is still broken.
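
For reference, refcount_dec_and_mutex_lock() returns with the mutex held
only when the counter it decrements reaches zero. A rough userspace model
of that contract (my own simplified sketch, not the kernel implementation,
which additionally handles a concurrent re-increment race):

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Drop one reference; only if it was the last one, take the lock and
 * return true so the caller can tear the object down. Because the t2
 * chain always runs with a reference held, the t1 chain can only reach
 * config_lock through here after t2 has finished, so the two chains
 * never hold their locks at the same time.
 */
static bool dec_and_mutex_lock(atomic_int *ref, pthread_mutex_t *lock)
{
	if (atomic_fetch_sub(ref, 1) != 1)
		return false;		/* other users remain, no lock taken */

	pthread_mutex_lock(lock);	/* last reference: lock for teardown */
	return true;
}

int main(void)
{
	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	atomic_int ref = 1;		/* single remaining user */

	if (dec_and_mutex_lock(&ref, &lock)) {
		/* teardown would happen here, under the lock */
		pthread_mutex_unlock(&lock);
	}
	return 0;
}

Lockdep tracks lock acquisition order, not reference lifetimes, so it
cannot prove this exclusion and reports the cycle anyway.
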
> 
> Fix the problem by breaking the dependency in t2: call
> blk_mq_update_nr_hw_queues() outside of nbd's internal config_lock. Other
> contexts can now run concurrently with nbd_start_device(), so also make
> sure they still return -EBUSY; the difference is that they will no longer
> wait for nbd_start_device() to be done.
> 
> Fixes: 98e68f67020c ("block: prevent adding/deleting disk during updating nr_hw_queues")
> Reported-by: syzbot+2bcecf3c38cb3e8fdc8d@...kaller.appspotmail.com
> Closes: https://lore.kernel.org/all/6855034f.a00a0220.137b3.0031.GAE@google.com/
> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
> ---
>  drivers/block/nbd.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 7bdc7eb808ea..d43e8e73aeb3 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1457,10 +1457,13 @@ static void nbd_config_put(struct nbd_device *nbd)
>  	}
>  }
>  
> -static int nbd_start_device(struct nbd_device *nbd)
> +static int nbd_start_device(struct nbd_device *nbd, bool netlink)
> +	__releases(&nbd->config_lock)
> +	__acquires(&nbd->config_lock)
>  {
>  	struct nbd_config *config = nbd->config;
>  	int num_connections = config->num_connections;
> +	struct task_struct *old;
>  	int error = 0, i;
>  
>  	if (nbd->pid)
> @@ -1473,8 +1476,21 @@ static int nbd_start_device(struct nbd_device *nbd)
>  		return -EINVAL;
>  	}
>  
> -	blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
> +	/*
> +	 * synchronize with concurrent nbd_start_device() and
> +	 * nbd_add_socket()
> +	 */
>  	nbd->pid = task_pid_nr(current);
> +	if (!netlink) {
> +		old = nbd->task_setup;
> +		nbd->task_setup = current;
> +	}
> +
> +	mutex_unlock(&nbd->config_lock);
> +	blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
> +	mutex_lock(&nbd->config_lock);
> +	if (!netlink)
> +		nbd->task_setup = old;

I guess the patch in the following link may be simpler; both take a
similar approach:

https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/


thanks,
Ming

