lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 11 Feb 2016 09:16:42 -0800
From:	"Darrick J. Wong" <darrick.wong@...cle.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Eric Whitney <enwlinux@...il.com>, linux-ext4@...r.kernel.org,
	tytso@....edu
Subject: Re: 4.5-rc1 panic when running generic/300 with dioread_nolock

On Thu, Feb 11, 2016 at 10:25:33AM +0100, Jan Kara wrote:
> On Wed 10-02-16 19:23:44, Eric Whitney wrote:
> > I've been seeing a panic and other signs of locking trouble when running
> > xfstest generic/300 with xfstests-bld's x86-64 test appliance in the
> > dioread_nolock test configuration.  This began with the 4.5-rc1 kernel and has
> > continued through 4.5-rc3.  It happens in more than 50% of my test runs and
> > is accompanied by an inconsistent lock state warning.  It does not occur in
> > any other xfstest-bld test configuration.
> > 
> > The panic bisects to "ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag"
> > (2dcba4781f).  Reverting this commit (requires minor merging) in a 4.5-rc1
> > kernel eliminates the panic.
> > 
> > Typical dmesg output follows.
> 
> Uh, it seems like we fail to set buffer_defer_completion() when creating
> unwritten extent for direct IO so ext4_end_io_dio() gets called directly
> from IRQ instead of from the workqueue. Not sure why that happens because
> _ext4_get_block() seems to set buffer_defer_completion whenever we return

But... set_buffer_defer_completion() only happens if there's an ioend attached
to the inode (inode.c around line 727):

if (io_end && io_end->flag & EXT4_IO_END_UNWRITTEN)
	set_buffer_defer_completion(bh);

AFAIK, there isn't be an ioend unless this is an AIO DIO, because the only
place we call ext4_inode_aio_set() with a non-NULL ioend is around line 3285
in inode.c, which only executes if (!is_sync_kiocb()), which is only true
in the AIO case.

So since this isn't an AIO DIO to an unwritten extent, we don't mark the bh
for deferred completion, so the dio completion can happen in irq context.
I guess we could make the ioend happen for any unwritten dio...

...or revert 2dcba4781f. <shrug>

(No idea if this is what's /supposed/ to happen for ext4...)

--D

> unwritten extent. I don't have time to debug this right now so it may be
> best to try bisecting the breakage if you have time to do that...
> 
> 								Honza
> 
> > 
> > Thanks,
> > Eric
> > 
> > generic/300 7s ...	[16:06:59][    3.320917] run fstests generic/300 at 2016-02-05 16:06:59
> > [    4.078053] 
> > [    4.078330] =================================
> > [    4.079036] [ INFO: inconsistent lock state ]
> > [    4.079407] 4.4.0-rc4+ #7 Not tainted
> > [    4.079724] ---------------------------------
> > [    4.080007] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage.
> > [    4.080007] fio/904 [HC1[1]:SC0[0]:HE0:SE1] takes:
> > [    4.080007]  (&journal->j_state_lock){+?++..}, at: [<ffffffff812f9bce>] start_this_handle+0xae/0x450
> > [    4.080007] {HARDIRQ-ON-W} state was registered at:
> > [    4.080007]   [<ffffffff810fc766>] __lock_acquire+0x896/0x1fb0
> > [    4.080007]   [<ffffffff810fe756>] lock_acquire+0xb6/0x130
> > [    4.080007]   [<ffffffff81724df1>] _raw_write_lock+0x31/0x40
> > [    4.080007]   [<ffffffff812c171d>] ext4_init_journal_params+0x4d/0xc0
> > [    4.080007]   [<ffffffff812d152e>] ext4_fill_super+0x236e/0x31c0
> > [    4.080007]   [<ffffffff8121cfe5>] mount_bdev+0x185/0x1b0
> > [    4.080007]   [<ffffffff812c1285>] ext4_mount+0x15/0x20
> > [    4.080007]   [<ffffffff8121d8a8>] mount_fs+0x38/0x170
> > [    4.080007]   [<ffffffff8123ce1b>] vfs_kern_mount+0x6b/0x160
> > [    4.080007]   [<ffffffff8123fda8>] do_mount+0x218/0xe70
> > [    4.080007]   [<ffffffff81240cfb>] SyS_mount+0x8b/0xd0
> > [    4.080007]   [<ffffffff81d49395>] mount_block_root+0xf6/0x284
> > [    4.080007]   [<ffffffff81d496ad>] mount_root+0x67/0x6a
> > [    4.080007]   [<ffffffff81d49817>] prepare_namespace+0x167/0x19f
> > [    4.080007]   [<ffffffff81d491c5>] kernel_init_freeable+0x268/0x278
> > [    4.080007]   [<ffffffff8171800e>] kernel_init+0xe/0xe0
> > [    4.080007]   [<ffffffff8172561f>] ret_from_fork+0x3f/0x70
> > [    4.080007] irq event stamp: 38034
> > [    4.080007] hardirqs last  enabled at (38033): [<ffffffff811adac7>] context_tracking_exit.part.6+0x37/0x60
> > [    4.080007] hardirqs last disabled at (38034): [<ffffffff81725d8c>] common_interrupt+0x8c/0x91
> > [    4.080007] softirqs last  enabled at (36084): [<ffffffff810ae829>] __do_softirq+0x2a9/0x340
> > [    4.080007] softirqs last disabled at (36063): [<ffffffff810aea8e>] irq_exit+0xbe/0xc0
> > [    4.080007] 
> > [    4.080007] other info that might help us debug this:
> > [    4.080007]  Possible unsafe locking scenario:
> > [    4.080007] 
> > [    4.080007]        CPU0
> > [    4.080007]        ----
> > [    4.080007]   lock(&journal->j_state_lock);
> > [    4.080007]   <Interrupt>
> > [    4.080007]     lock(&journal->j_state_lock);
> > [    4.080007] 
> > [    4.080007]  *** DEADLOCK ***
> > [    4.080007] 
> > [    4.080007] 1 lock held by fio/904:
> > [    4.080007]  #0:  (&(&vblk->vqs[i].lock)->rlock){-.-...}, at: [<ffffffff81588193>] virtblk_done+0x43/0xe0
> > [    4.080007] 
> > [    4.080007] stack backtrace:
> > [    4.080007] CPU: 0 PID: 904 Comm: fio Not tainted 4.4.0-rc4+ #7
> > [    4.080007] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> > [    4.080007]  ffffffff826152e0 ffff88007fc039b0 ffffffff8138fb82 ffff88007a18c880
> > [    4.080007]  ffff88007fc03a00 ffffffff811aed8c 0000000000000000 0000000000000000
> > [    4.080007]  0000000000000001 0000000000000001 ffff88007a18d0d8 ffff88007a18c880
> > [    4.080007] Call Trace:
> > [    4.080007]  <IRQ>  [<ffffffff8138fb82>] dump_stack+0x44/0x62
> > [    4.080007]  [<ffffffff811aed8c>] print_usage_bug+0x1f3/0x208
> > [    4.080007]  [<ffffffff810fb889>] mark_lock+0x649/0x6c0
> > [    4.080007]  [<ffffffff810fa9e0>] ? print_shortest_lock_dependencies+0x1d0/0x1d0
> > [    4.080007]  [<ffffffff810fc8ee>] __lock_acquire+0xa1e/0x1fb0
> > [    4.080007]  [<ffffffff810f7565>] ? cpuacct_charge+0x5/0x190
> > [    4.080007]  [<ffffffff810f983d>] ? __lock_is_held+0x4d/0x70
> > [    4.080007]  [<ffffffff810f983d>] ? __lock_is_held+0x4d/0x70
> > [    4.080007]  [<ffffffff810fe756>] lock_acquire+0xb6/0x130
> > [    4.080007]  [<ffffffff812f9bce>] ? start_this_handle+0xae/0x450
> > [    4.080007]  [<ffffffff81724b34>] _raw_read_lock+0x34/0x50
> > [    4.080007]  [<ffffffff812f9bce>] ? start_this_handle+0xae/0x450
> > [    4.080007]  [<ffffffff812f9bce>] start_this_handle+0xae/0x450
> > [    4.080007]  [<ffffffff812f9700>] ? new_handle+0x20/0x60
> > [    4.080007]  [<ffffffff812fa10f>] jbd2__journal_start+0xbf/0x190
> > [    4.080007]  [<ffffffff812dd171>] ? ext4_convert_unwritten_extents+0xb1/0x210
> > [    4.080007]  [<ffffffff812df221>] __ext4_journal_start_sb+0x71/0xd0
> > [    4.080007]  [<ffffffff812dd171>] ext4_convert_unwritten_extents+0xb1/0x210
> > [    4.080007]  [<ffffffff812b4dee>] ext4_put_io_end+0x4e/0x70
> > [    4.080007]  [<ffffffff812ab13a>] ext4_end_io_dio+0x2a/0x30
> > [    4.080007]  [<ffffffff81258bcb>] dio_complete+0xcb/0x1b0
> > [    4.080007]  [<ffffffff81258d2f>] dio_bio_end_aio+0x7f/0x130
> > [    4.080007]  [<ffffffff8135d06f>] bio_endio+0x3f/0x60
> > [    4.080007]  [<ffffffff813647c7>] blk_update_request+0x97/0x320
> > [    4.080007]  [<ffffffff8136e85a>] blk_mq_end_request+0x1a/0x70
> > [    4.080007]  [<ffffffff8158833f>] virtblk_request_done+0x3f/0x70
> > [    4.080007]  [<ffffffff8136e92b>] __blk_mq_complete_request+0x7b/0xf0
> > [    4.080007]  [<ffffffff8136e9bc>] blk_mq_complete_request+0x1c/0x20
> > [    4.080007]  [<ffffffff815881b4>] virtblk_done+0x64/0xe0
> > [    4.080007]  [<ffffffff81432ee1>] vring_interrupt+0x31/0x50
> > [    4.080007]  [<ffffffff81107ac1>] handle_irq_event_percpu+0x81/0x1b0
> > [    4.080007]  [<ffffffff81107c29>] handle_irq_event+0x39/0x60
> > [    4.080007]  [<ffffffff8110ac9f>] handle_edge_irq+0x6f/0x150
> > [    4.080007]  [<ffffffff81056ded>] handle_irq+0x1d/0x30
> > [    4.080007]  [<ffffffff81727ab1>] do_IRQ+0x61/0x120
> > [    4.080007]  [<ffffffff81725d91>] common_interrupt+0x91/0x91
> > [    4.080007]  <EOI>  [<ffffffff811adacc>] ? context_tracking_exit.part.6+0x3c/0x60
> > [    4.080007]  [<ffffffff811adb0d>] context_tracking_exit+0x1d/0x20
> > [    4.080007]  [<ffffffff810026bf>] enter_from_user_mode+0x1f/0x50
> > [    4.080007]  [<ffffffff810027b6>] syscall_trace_enter_phase1+0xc6/0x130
> > [    4.080007]  [<ffffffff81002017>] ? trace_hardirqs_on_thunk+0x17/0x19
> > [    4.080007]  [<ffffffff81725335>] tracesys+0xd/0x44
> > [    4.080007] ------------[ cut here ]------------
> > [    4.080007] kernel BUG at fs/buffer.c:1281!
> > [    4.080007] invalid opcode: 0000 [#1] SMP 
> > [    4.080007] Modules linked in:
> > [    4.080007] CPU: 0 PID: 904 Comm: fio Not tainted 4.4.0-rc4+ #7
> > [    4.080007] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> > [    4.080007] task: ffff88007a18c880 ti: ffff880079144000 task.ti: ffff880079144000
> > [    4.080007] RIP: 0010:[<ffffffff81252f41>]  [<ffffffff81252f41>] __find_get_block+0x121/0x130
> > [    4.080007] RSP: 0018:ffff88007fc03a78  EFLAGS: 00010046
> > [    4.080007] RAX: 0000000000000086 RBX: 0000000000018980 RCX: 0000000002400048
> > [    4.080007] RDX: 0000000000001000 RSI: 000000000000dde0 RDI: ffff88007c06ca80
> > [    4.080007] RBP: ffff88007fc03aa8 R08: 0000000000000001 R09: 0000000000000000
> > [    4.080007] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007c06ca80
> > [    4.080007] R13: 000000000000dde0 R14: 0000000000001000 R15: 0000000002400048
> > [    4.080007] FS:  00007fcfc0cd4740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
> > [    4.080007] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    4.080007] CR2: 000000000258c018 CR3: 000000007a031000 CR4: 00000000000006f0
> > [    4.080007] Stack:
> > [    4.080007]  ffffffff81253223 0000000000018980 ffff880079c86af0 000000000000dde0
> > [    4.080007]  ffff88007a088900 0000000000000001 ffff88007fc03b00 ffffffff812d68d6
> > [    4.080007]  0000000002408040 000003867fc03b00 ffffffff81841db0 ffffffff00000000
> > [    4.080007] Call Trace:
> > [    4.080007]  <IRQ> 
> > [    4.080007]  [<ffffffff81253223>] ? __getblk_gfp+0x23/0x60
> > [    4.080007]  [<ffffffff812d68d6>] __read_extent_tree_block+0x46/0x200
> > [    4.080007]  [<ffffffff812d7b89>] ext4_find_extent+0x139/0x320
> > [    4.080007]  [<ffffffff812dc2a7>] ext4_ext_map_blocks+0x87/0xdd0
> > [    4.080007]  [<ffffffff812adbaa>] ? ext4_map_blocks+0xea/0x4b0
> > [    4.080007]  [<ffffffff812adbce>] ext4_map_blocks+0x10e/0x4b0
> > [    4.080007]  [<ffffffff812dd171>] ? ext4_convert_unwritten_extents+0xb1/0x210
> > [    4.080007]  [<ffffffff812dd194>] ext4_convert_unwritten_extents+0xd4/0x210
> > [    4.080007]  [<ffffffff812b4dee>] ext4_put_io_end+0x4e/0x70
> > [    4.080007]  [<ffffffff812ab13a>] ext4_end_io_dio+0x2a/0x30
> > [    4.080007]  [<ffffffff81258bcb>] dio_complete+0xcb/0x1b0
> > [    4.080007]  [<ffffffff81258d2f>] dio_bio_end_aio+0x7f/0x130
> > [    4.080007]  [<ffffffff8135d06f>] bio_endio+0x3f/0x60
> > [    4.080007]  [<ffffffff813647c7>] blk_update_request+0x97/0x320
> > [    4.080007]  [<ffffffff8136e85a>] blk_mq_end_request+0x1a/0x70
> > [    4.080007]  [<ffffffff8158833f>] virtblk_request_done+0x3f/0x70
> > [    4.080007]  [<ffffffff8136e92b>] __blk_mq_complete_request+0x7b/0xf0
> > [    4.080007]  [<ffffffff8136e9bc>] blk_mq_complete_request+0x1c/0x20
> > [    4.080007]  [<ffffffff815881b4>] virtblk_done+0x64/0xe0
> > [    4.080007]  [<ffffffff81432ee1>] vring_interrupt+0x31/0x50
> > [    4.080007]  [<ffffffff81107ac1>] handle_irq_event_percpu+0x81/0x1b0
> > [    4.080007]  [<ffffffff81107c29>] handle_irq_event+0x39/0x60
> > [    4.080007]  [<ffffffff8110ac9f>] handle_edge_irq+0x6f/0x150
> > [    4.080007]  [<ffffffff81056ded>] handle_irq+0x1d/0x30
> > [    4.080007]  [<ffffffff81727ab1>] do_IRQ+0x61/0x120
> > [    4.080007]  [<ffffffff81725d91>] common_interrupt+0x91/0x91
> > [    4.080007]  <EOI> 
> > [    4.080007]  [<ffffffff811adacc>] ? context_tracking_exit.part.6+0x3c/0x60
> > [    4.080007]  [<ffffffff811adb0d>] context_tracking_exit+0x1d/0x20
> > [    4.080007]  [<ffffffff810026bf>] enter_from_user_mode+0x1f/0x50
> > [    4.080007]  [<ffffffff810027b6>] syscall_trace_enter_phase1+0xc6/0x130
> > [    4.080007]  [<ffffffff81002017>] ? trace_hardirqs_on_thunk+0x17/0x19
> > [    4.080007]  [<ffffffff81725335>] tracesys+0xd/0x44
> > [    4.080007] Code: de ff d0 49 8b 04 24 48 85 c0 75 e9 eb b3 4c 89 e6 4c 89 ef e8 41 ed ff ff 48 85 c0 48 89 c3 74 b5 48 89 c7 e8 e1 fc ff ff eb ab <0f> 0b 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 
> > [    4.080007] RIP  [<ffffffff81252f41>] __find_get_block+0x121/0x130
> > [    4.080007]  RSP <ffff88007fc03a78>
> > [    4.080007] ---[ end trace 9bbf559562132712 ]---
> > [    4.080007] Kernel panic - not syncing: Fatal exception in interrupt
> > [    4.080007] Kernel Offset: disabled
> > [    4.080007] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
> > 
> > 
> > 
> > 
> > 
> -- 
> Jan Kara <jack@...e.com>
> SUSE Labs, CR
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ