linux-kernel - Re: nbd locking problems

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110128211825.GA20707@mail.hallyn.com>
Date:	Fri, 28 Jan 2011 21:18:26 +0000
From:	"Serge E. Hallyn" <serge@...lyn.com>
To:	Soren Hansen <soren@...ux2go.dk>
Cc:	linux-kernel@...r.kernel.org, Dave Hansen <haveblue@...ibm.com>
Subject: Re: nbd locking problems

Quoting Soren Hansen (soren@...ux2go.dk):
> I'm seeing the following lockup on a current kernel:
> 
> [117360.880311] INFO: task qemu-nbd:16115 blocked for more than 120 seconds.
> [117360.880319] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [117360.880326] qemu-nbd      D 0000000000000001     0 16115  16114 0x00000004
> [117360.880337]  ffff88007d775d98 0000000000000082 ffff88007d775fd8
> ffff88007d774000
> [117360.880348]  0000000000013a80 ffff8800020347e0 ffff88007d775fd8
> 0000000000013a80
> [117360.880359]  ffff880133730000 ffff880002034440 ffffea0004333db8
> ffffffffa071c020
> [117360.880369] Call Trace:
> [117360.880389]  [<ffffffff815b9997>] __mutex_lock_slowpath+0xf7/0x180
> [117360.880400]  [<ffffffff81132f13>] ? handle_mm_fault+0x1c3/0x410
> [117360.880408]  [<ffffffff815b93eb>] mutex_lock+0x2b/0x50
> [117360.880419]  [<ffffffffa071a21c>] nbd_ioctl+0x6c/0x1c0 [nbd]
> [117360.880429]  [<ffffffff812cb970>] blkdev_ioctl+0x230/0x730
> [117360.880438]  [<ffffffff811967a1>] block_ioctl+0x41/0x50
> [117360.880447]  [<ffffffff81175c03>] do_vfs_ioctl+0x93/0x370
> [117360.880455]  [<ffffffff8116fa83>] ? putname+0x33/0x50
> [117360.880463]  [<ffffffff81175f61>] sys_ioctl+0x81/0xa0
> [117360.880473]  [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b
> 
> It seems to be caused by 2a48fc0ab24241755dc93bfd4f01d68efab47f5a which
> replaced lock_kernel() with mutex operations.
> 
> In my attempts to debug this problem, I reverted that commit.
> 
> There's a printk statement that logs each ioctl call right before the
> lock_kernel call. I extended it to log the pid of the caller, and added
> a similar printk statement after unlock_kernel.
> 
> This worked for me, and I saw something like this in my dmesg:
> 
> [122322.650443] nbd9: in: nbd_ioctl pid=(24396) cmd=set-blksize(0xab01) arg=1024
> [122322.650454] nbd9: out: nbd_ioctl pid=(24396)
> cmd=set-blksize(0xab01) arg=1024
> [122322.650462] nbd9: in: nbd_ioctl pid=(24396)
> cmd=set-size-blocks(0xab07) arg=153600
> [122322.650469] nbd9: out: nbd_ioctl pid=(24396)
> cmd=set-size-blocks(0xab07) arg=153600
> [122322.650476] nbd9: in: nbd_ioctl pid=(24396) cmd=clear-sock(0xab04)
> arg=153600
> [122322.650483] nbd9: out: nbd_ioctl pid=(24396)
> cmd=clear-sock(0xab04) arg=153600
> [122322.650490] nbd9: in: nbd_ioctl pid=(24396) cmd=set-sock(0xab00) arg=3
> [122322.650497] nbd9: out: nbd_ioctl pid=(24396) cmd=set-sock(0xab00) arg=3
> [122322.651178] nbd9: in: nbd_ioctl pid=(24396) cmd=do-it(0xab03)
> arg=139925697560832
> [122322.652454]  nbd9: unknown partition table
> [122323.070522] nbd9: in: nbd_ioctl pid=(24427) cmd=clear-que(0xab05)
> arg=140267460190640
> [122323.070527] nbd9: out: nbd_ioctl pid=(24427) cmd=clear-que(0xab05)
> arg=140267460190640
> [122323.070530] nbd9: in: nbd_ioctl pid=(24427) cmd=disconnect(0xab08)
> arg=140267460190640
> [122323.070532] nbd9: NBD_DISCONNECT
> [122323.070555] nbd9: out: nbd_ioctl pid=(24427)
> cmd=disconnect(0xab08) arg=140267460190640
> [122323.070558] nbd9: in: nbd_ioctl pid=(24427) cmd=clear-sock(0xab04)
> arg=140267460190640
> [122323.070561] nbd9: out: nbd_ioctl pid=(24427)
> cmd=clear-sock(0xab04) arg=140267460190640
> [122323.070667] nbd9: Receive control failed (result -32)
> [122323.070710] nbd9: queue cleared
> [122323.071186] nbd9: out: nbd_ioctl pid=(24396) cmd=do-it(0xab03)
> arg=139925697560832
> [122323.071194] nbd9: in: nbd_ioctl pid=(24396) cmd=clear-que(0xab05) arg=32
> [122323.071197] nbd9: out: nbd_ioctl pid=(24396) cmd=clear-que(0xab05) arg=32
> [122323.071200] nbd9: in: nbd_ioctl pid=(24396) cmd=clear-sock(0xab04) arg=32
> [122323.071203] nbd9: out: nbd_ioctl pid=(24396) cmd=clear-sock(0xab04) arg=32
> 
> As I understand it, lock_kernel is implemented using a semaphore which
> allows a single process to invoke lock_kernel more than once without
> locking up. However, as can be seen above (timestamp 122322.651178), pid
> 24396 invokes the NBD_DO_IT ioctl. This calls lock_kernel and starts a
> kernel thread to do the heavy lifting of nbd. Later on, pid 24427 comes
> along, wanting to shut down the connection. It works. I don't understand
> how it makes it past the lock_kernel() call. There's only one instance
> of each of lock_kernel and unlock_kernel in the nbd driver, so I'm
> fairly certain unlock_kernel isn't being called (at least not by nbd
> directly).
> 
> With the mutex code introduced by
> 2a48fc0ab24241755dc93bfd4f01d68efab47f5a but with my extra printk's, this
> is what I get:
> 
> [125474.139952] nbd9: in: nbd_ioctl pid=(26015) cmd=set-blksize(0xab01) arg=1024
> [125474.139963] nbd9: out: nbd_ioctl pid=(26015)
> cmd=set-blksize(0xab01) arg=1024
> [125474.139971] nbd9: in: nbd_ioctl pid=(26015)
> cmd=set-size-blocks(0xab07) arg=153600
> [125474.139979] nbd9: out: nbd_ioctl pid=(26015)
> cmd=set-size-blocks(0xab07) arg=153600
> [125474.139986] nbd9: in: nbd_ioctl pid=(26015) cmd=clear-sock(0xab04)
> arg=153600
> [125474.139994] nbd9: out: nbd_ioctl pid=(26015)
> cmd=clear-sock(0xab04) arg=153600
> [125474.140001] nbd9: in: nbd_ioctl pid=(26015) cmd=set-sock(0xab00) arg=3
> [125474.140036] nbd9: out: nbd_ioctl pid=(26015) cmd=set-sock(0xab00) arg=3
> [125474.140660] nbd9: in: nbd_ioctl pid=(26015) cmd=do-it(0xab03)
> arg=140041726828800
> [125474.142255]  nbd9: unknown partition table
> [125475.604613] nbd9: in: nbd_ioctl pid=(26047) cmd=clear-que(0xab05)
> arg=139640717427120
> 
> ...and then it sits there until I kill -9 26015.  This makes sense to
> me, though. The mutex is held by 26015, so of course 26047 can't grab
> it, too. The mystery is why it works with lock_kernel. It doesn't
> release the kernel explicitly, and it doesn't (explicitly, at least)
> call schedule().

Thanks for looking into this, Soren.

As Dave just explained to me, BKL is released when you sleep :)  I
assume that's how it gets released around the kthread_create().  I
*think* you're right that the new mutex is superfluous, but I'd like to
look through the ioctl code and make sure there's no shared state which
we need to protect.  I don't see how there could be, or rather, if there
were, then it was broken before.

In the meantime, could you please post your patch?  (Is it the
three-liner I'd expect?)

thanks,
-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/