linux-kernel - Re: [PATCH] nbd: fix potential deadlock in nbd_config

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5DE3F444.9080706@redhat.com>
Date:   Sun, 1 Dec 2019 11:11:32 -0600
From:   Mike Christie <mchristi@...hat.com>
To:     Sun Ke <sunke32@...wei.com>, josef@...icpanda.com
Cc:     axboe@...nel.dk, linux-block@...r.kernel.org, nbd@...er.debian.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nbd: fix potential deadlock in nbd_config_put()

On 11/28/2019 04:45 AM, Sun Ke wrote:
> I got a deadlock report from syzkaller:
> 
> [  234.427696] ============================================
> [  234.428327] WARNING: possible recursive locking detected
> [  234.429011] 5.4.0-rc4+ #1 Not tainted
> [  234.429528] --------------------------------------------
> [  234.430162] kworker/u9:0/894 is trying to acquire lock:
> [  234.430911] ffff0000d3aca128 ((wq_completion)knbd0-recv){+.+.}, at: flush_workqueue+0xbc/0xfe8
> [  234.432330]
> [  234.432330] but task is already holding lock:
> [  234.432927] ffff0000d3aca128 ((wq_completion)knbd0-recv){+.+.}, at: process_one_work+0x6a0/0x17e8
> [  234.433983]
> [  234.433983] other info that might help us debug this:
> [  234.434615]  Possible unsafe locking scenario:
> [  234.434615]
> [  234.435263]        CPU0
> [  234.435613]        ----
> [  234.436019]   lock((wq_completion)knbd0-recv);
> [  234.436521]   lock((wq_completion)knbd0-recv);
> [  234.437166]
> [  234.437166]  *** DEADLOCK ***
> [  234.437166]
> [  234.437763]  May be due to missing lock nesting notation
> [  234.437763]
> [  234.438559] 3 locks held by kworker/u9:0/894:
> [  234.439040]  #0: ffff0000d3aca128 ((wq_completion)knbd0-recv){+.+.}, at: process_one_work+0x6a0/0x17e8
> [  234.440185]  #1: ffff0000d344fd50 ((work_completion)(&args->work)){+.+.}, at: process_one_work+0x6a0/0x17e8
> [  234.442209]  #2: ffff0000d723cd78 (&nbd->config_lock){+.+.}, at: refcount_dec_and_mutex_lock+0x5c/0x128
> [  234.443380]
> [  234.443380] stack backtrace:
> [  234.444271] CPU: 3 PID: 894 Comm: kworker/u9:0 Not tainted 5.4.0-rc4+ #1
> [  234.444989] Hardware name: linux,dummy-virt (DT)
> [  234.446077] Workqueue: knbd0-recv recv_work
> [  234.446909] Call trace:
> [  234.447372]  dump_backtrace+0x0/0x358
> [  234.447877]  show_stack+0x28/0x38
> [  234.448347]  dump_stack+0x15c/0x1ec
> [  234.448838]  __lock_acquire+0x12ec/0x2f78
> [  234.449474]  lock_acquire+0x180/0x590
> [  234.450075]  flush_workqueue+0x104/0xfe8
> [  234.450587]  drain_workqueue+0x164/0x390
> [  234.451090]  destroy_workqueue+0x30/0x560
> [  234.451598]  nbd_config_put+0x308/0x700
> [  234.452093]  recv_work+0x198/0x1f0
> [  234.452556]  process_one_work+0x7ac/0x17e8
> [  234.453189]  worker_thread+0x36c/0xb70
> [  234.453788]  kthread+0x2f4/0x378
> [  234.454257]  ret_from_fork+0x10/0x18
> 
> The root cause is recv_work() is the last one to drop the config
> ref and try to destroy the workqueue from inside the work queue.
> 
> There are two ways to fix the bug. The first way is flushing the
> workqueue before dropping the initial refcount and making sure
> recv_work() will not be the last owner of nbd_config. However it
> is hard for ioctl interface. Because nbd_clear_sock_ioctl() may
> not be invoked, so we need to flush the workqueue in nbd_release()
> and that will lead to another deadlock because recv_work can not
> exit from nbd_read_stat() loop.
> 
> The second way is using another work to put nbd_config asynchronously
> for recv_work().
> 

Can we also increment the refcount before we do wait_event_interruptible
in nbd_start_device_ioctl, always flush the workqueue then drop the
refcount after the flush? See compile tested only patch.

We will then also know the recv side has stopped after
nbd_start_device_ioctl has returned, so new NBD_DO_IT calls for the same
device do not race with shutdowns of previous uses.



View attachment "nbd-move-flush.patch" of type "text/x-patch" (934 bytes)