Message-ID: <20180208140833.lpr4yjn7g3v3cdy3@quack2.suse.cz>
Date:   Thu, 8 Feb 2018 15:08:33 +0100
From:   Jan Kara <jack@...e.cz>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Jan Kara <jack@...e.cz>, Andi Kleen <ak@...ux.intel.com>,
        syzbot <syzbot+283c3c447181741aea28@...kaller.appspotmail.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>, jlayton@...hat.com,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Ingo Molnar <mingo@...nel.org>, rgoldwyn@...e.com,
        syzkaller-bugs@...glegroups.com, linux-fsdevel@...r.kernel.org
Subject: Re: INFO: task hung in sync_blockdev

On Thu 08-02-18 14:28:08, Dmitry Vyukov wrote:
> On Thu, Feb 8, 2018 at 10:28 AM, Jan Kara <jack@...e.cz> wrote:
> > On Wed 07-02-18 07:52:29, Andi Kleen wrote:
> >> >  #0:  (&bdev->bd_mutex){+.+.}, at: [<0000000040269370>]
> >> > __blkdev_put+0xbc/0x7f0 fs/block_dev.c:1757
> >> > 1 lock held by blkid/19199:
> >> >  #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> > __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >> >  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<0000000033edf9f2>]
> >> > n_tty_read+0x2ef/0x1a00 drivers/tty/n_tty.c:2131
> >> > 1 lock held by syz-executor5/19330:
> >> >  #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> > __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >> > 1 lock held by syz-executor5/19331:
> >> >  #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> > __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >>
> >> It seems multiple processes deadlocked on the bd_mutex.
> >> Unfortunately there's no backtrace for the lock acquisitions,
> >> so it's hard to see the exact sequence.
> >
> > Well, everything in the report points to a situation where some IO was
> > submitted to the block device and never completed (more precisely, it took
> > longer than those 120s to complete that IO). It would need more digging
> > into the syzkaller program to find out what kind of device that was and
> > possibly why the IO took so long to complete...
> 
> 
> Would a traceback of all task stacks help in this case?
> What I've seen in several "task hung" reports is that the CPU
> traceback is not showing anything useful. So perhaps it should be
> changed to task traceback? Or it would not help either?

A stack traceback of all tasks (usually only the tasks in D state, i.e.
sysrq-w, are actually enough) would definitely help for debugging
deadlocks on sleeping locks. For this particular case I'm not sure whether
it would help, since it is quite possible the IO is just sitting in some
queue and never getting processed because a racing syzkaller process tore
down the device at the wrong moment, or something like that... Such a case
is very difficult to debug without a full kernel crashdump of the hung
kernel (or a reproducer, for that matter), and even with that it is usually
rather time-consuming. But for deadlocks, which do occur more frequently,
it would probably be worth the time, so it would be nice if such an option
were eventually available.
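
To illustrate what collecting stacks of D-state tasks involves, here is a
minimal userspace sketch. It is not part of the original report, the kernel,
or syzkaller; the function names are hypothetical. It assumes a Linux /proc
where /proc/<pid>/stat carries the task state as the field after the
parenthesised command name, and where /proc/<pid>/stack is readable (this
normally needs root and a kernel built with stack trace support). It only
looks at the main thread of each process. The in-kernel equivalent is
sysrq-w, e.g. writing "w" to /proc/sysrq-trigger, which dumps blocked tasks
to the kernel log.

```python
import os


def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for each process whose main thread is in D state."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited (or stat was unreadable) mid-scan
        if state == "D":
            yield pid, comm


def kernel_stack(pid, proc="/proc"):
    """Return the kernel stack text for pid, or a placeholder on failure."""
    try:
        with open(os.path.join(proc, str(pid), "stack")) as f:
            return f.read()
    except OSError as exc:
        return "<stack unavailable: %s>" % exc


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print("%d (%s):" % (pid, comm))
        print(kernel_stack(pid))
```

On a hung machine this only helps if a shell is still responsive; otherwise
the kernel-side dump remains the only option.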

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
