lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181106052051.GD6416@thunk.org>
Date:   Tue, 6 Nov 2018 00:20:51 -0500
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Jan Kara <jack@...e.cz>
Cc:     linux-ext4@...r.kernel.org, Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [PATCH] jbd2: Avoid long hold times of j_state_lock while
 committing a transaction

On Tue, Oct 16, 2018 at 11:42:48AM +0200, Jan Kara wrote:
> We can hold j_state_lock for writing at the beginning of
> jbd2_journal_commit_transaction() for a rather long time (reportedly for
> 30 ms) due cleaning revoke bits of all revoked buffers under it. The
> handling of revoke tables as well as cleaning of t_reserved_list, and
> checkpoint lists does not need j_state_lock for anything. Furthermore
> the transaction is in T_LOCKED state and we waited for all outstanding
> handles so nobody is going to be adding anything to the transaction.
> 
> Just drop the lock for unnecessary operations.
> 
> Reported-and-tested-by: Adrian Hunter <adrian.hunter@...el.com>
> Suggested-by: "Theodore Y. Ts'o" <tytso@....edu>
> Signed-off-by: Jan Kara <jack@...e.cz>

Unfortunately, this patch is causing a regression in generic/241 in
the dioread_nolock configuration.   To reproduce:

% gce-xftests -c dioread_nolock generic/241

which will result in the decoded stack trace:

[   42.926455] run fstests generic/241 at 2018-11-06 00:12:30
[   51.091678] JBD2: assertion failure: h_type=11 h_line_no=4921 block_no=524337 jlist=3
[   51.101179] ------------[ cut here ]------------
[   51.106191] kernel BUG at fs/jbd2/transaction.c:1372!
[   51.111716] invalid opcode: 0000 [#1] SMP PTI
[   51.116198] CPU: 0 PID: 177 Comm: kworker/u4:2 Not tainted 4.19.0-rc6-xfstests-00022-g895dc4ba4f98 #730
[   51.125809] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[   51.135375] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
[   51.141433] RIP: 0010:jbd2_journal_dirty_metadata.cold.7 (??:?) 
[ 51.147867] Code: c1 ea 0c 66 c1 ee 04 0f b7 d2 40 0f b6 f6 e8 c4 64 de ff 4d 39 6c 24 28 0f 85 22 f4 ff ff 41 83 7c 24 0c 01 0f 84 16 f4 ff ff <0f> 0b 4d 85 c9 74 2c 41 8b 41 08 44 8b 41 08 49 8d b7 c8 05 00 00
All code
========
   0:	c1 ea 0c             	shr    $0xc,%edx
   3:	66 c1 ee 04          	shr    $0x4,%si
   7:	0f b7 d2             	movzwl %dx,%edx
   a:	40 0f b6 f6          	movzbl %sil,%esi
   e:	e8 c4 64 de ff       	callq  0xffffffffffde64d7
  13:	4d 39 6c 24 28       	cmp    %r13,0x28(%r12)
  18:	0f 85 22 f4 ff ff    	jne    0xfffffffffffff440
  1e:	41 83 7c 24 0c 01    	cmpl   $0x1,0xc(%r12)
  24:	0f 84 16 f4 ff ff    	je     0xfffffffffffff440
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	4d 85 c9             	test   %r9,%r9
  2f:	74 2c                	je     0x5d
  31:	41 8b 41 08          	mov    0x8(%r9),%eax
  35:	44 8b 41 08          	mov    0x8(%rcx),%r8d
  39:	49 8d b7 c8 05 00 00 	lea    0x5c8(%r15),%rsi

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	4d 85 c9             	test   %r9,%r9
   5:	74 2c                	je     0x33
   7:	41 8b 41 08          	mov    0x8(%r9),%eax
   b:	44 8b 41 08          	mov    0x8(%rcx),%r8d
   f:	49 8d b7 c8 05 00 00 	lea    0x5c8(%r15),%rsi
[   51.166848] RSP: 0018:ffffb08481573c60 EFLAGS: 00010202
[   51.172195] RAX: 0000000000000049 RBX: ffff8e72c98e6c98 RCX: 0000000000000000
[   51.179737] RDX: 0000000000000000 RSI: ffff8e72dfbd54c8 RDI: ffff8e72dfbd54c8
[   51.187020] RBP: ffff8e72d3247600 R08: 0000001969fa2eee R09: 0000000000000000
[   51.194639] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e72c21b22a0
[   51.201985] R13: ffff8e72c224ce00 R14: 0000000000000000 R15: ffff8e72d2dc9000
[   51.209795] FS:  0000000000000000(0000) GS:ffff8e72dfa00000(0000) knlGS:0000000000000000
[   51.218109] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.224006] CR2: 00007f67fc9bc420 CR3: 0000000174612001 CR4: 00000000001606f0
[   51.232026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   51.240026] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   51.247810] Call Trace:
[   51.250394] __ext4_handle_dirty_metadata (include/linux/jbd2.h:1501 fs/ext4/ext4_jbd2.c:271) 
[   51.255691] ext4_do_update_inode.isra.11 (fs/ext4/inode.c:5324) 
[   51.261187] ? __ext4_journal_get_write_access (fs/ext4/ext4_jbd2.c:170) 
[   51.266943] ? ext4_convert_unwritten_extents (fs/ext4/extents.c:4951) 
[   51.272984] ext4_mark_iloc_dirty (fs/ext4/inode.c:5849) 
[   51.277389] ext4_mark_inode_dirty (fs/ext4/inode.c:6028) 
[   51.282260] ext4_convert_unwritten_extents (fs/ext4/extents.c:4951) 
[   51.287900] ext4_end_io_rsv_work (fs/ext4/page-io.c:161 fs/ext4/page-io.c:234 fs/ext4/page-io.c:248) 
[   51.292523] process_one_work (arch/x86/include/asm/jump_label.h:36 include/linux/jump_label.h:142 include/trace/events/workqueue.h:114 kernel/workqueue.c:2158) 
[   51.296937] worker_thread (include/linux/compiler.h:188 include/linux/list.h:203 kernel/workqueue.c:2297) 
[   51.301379] ? process_one_work (kernel/workqueue.c:2239) 
[   51.305700] kthread (kernel/kthread.c:246) 
[   51.309048] ? kthread_park (kernel/kthread.c:206) 
[   51.312836] ret_from_fork (arch/x86/entry/entry_64.S:419) 
[   51.316531] Modules linked in: sg
[   51.320170] ---[ end trace 14f78710c7a5d90b ]---
[   51.325121] RIP: 0010:jbd2_journal_dirty_metadata.cold.7 (??:?) 
[ 51.331516] Code: c1 ea 0c 66 c1 ee 04 0f b7 d2 40 0f b6 f6 e8 c4 64 de ff 4d 39 6c 24 28 0f 85 22 f4 ff ff 41 83 7c 24 0c 01 0f 84 16 f4 ff ff <0f> 0b 4d 85 c9 74 2c 41 8b 41 08 44 8b 41 08 49 8d b7 c8 05 00 00
All code
========
   0:	c1 ea 0c             	shr    $0xc,%edx
   3:	66 c1 ee 04          	shr    $0x4,%si
   7:	0f b7 d2             	movzwl %dx,%edx
   a:	40 0f b6 f6          	movzbl %sil,%esi
   e:	e8 c4 64 de ff       	callq  0xffffffffffde64d7
  13:	4d 39 6c 24 28       	cmp    %r13,0x28(%r12)
  18:	0f 85 22 f4 ff ff    	jne    0xfffffffffffff440
  1e:	41 83 7c 24 0c 01    	cmpl   $0x1,0xc(%r12)
  24:	0f 84 16 f4 ff ff    	je     0xfffffffffffff440
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	4d 85 c9             	test   %r9,%r9
  2f:	74 2c                	je     0x5d
  31:	41 8b 41 08          	mov    0x8(%r9),%eax
  35:	44 8b 41 08          	mov    0x8(%rcx),%r8d
  39:	49 8d b7 c8 05 00 00 	lea    0x5c8(%r15),%rsi

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	4d 85 c9             	test   %r9,%r9
   5:	74 2c                	je     0x33
   7:	41 8b 41 08          	mov    0x8(%r9),%eax
   b:	44 8b 41 08          	mov    0x8(%rcx),%r8d
   f:	49 8d b7 c8 05 00 00 	lea    0x5c8(%r15),%rsi
[   51.350792] RSP: 0018:ffffb08481573c60 EFLAGS: 00010202
[   51.356407] RAX: 0000000000000049 RBX: ffff8e72c98e6c98 RCX: 0000000000000000
[   51.363778] RDX: 0000000000000000 RSI: ffff8e72dfbd54c8 RDI: ffff8e72dfbd54c8
[   51.371485] RBP: ffff8e72d3247600 R08: 0000001969fa2eee R09: 0000000000000000
[   51.378856] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e72c21b22a0
[   51.386323] R13: ffff8e72c224ce00 R14: 0000000000000000 R15: ffff8e72d2dc9000
[   51.393775] FS:  0000000000000000(0000) GS:ffff8e72dfa00000(0000) knlGS:0000000000000000
[   51.402416] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.408605] CR2: 00007f67fc9bc420 CR3: 0000000174612001 CR4: 00000000001606f0
[   51.416057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   51.423596] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   51.431668] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:34
[   51.441028] in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: kworker/u4:2
[   51.448399] INFO: lockdep is turned off.
[   51.452644] CPU: 0 PID: 177 Comm: kworker/u4:2 Tainted: G      D           4.19.0-rc6-xfstests-00022-g895dc4ba4f98 #730
[   51.464004] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[   51.473697] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
[   51.479745] Call Trace:
[   51.482499] dump_stack (lib/dump_stack.c:115) 
[   51.486326] ___might_sleep.cold.12 (kernel/sched/core.c:6145) 
[   51.491043] exit_signals (include/linux/percpu-rwsem.h:36 include/linux/percpu-rwsem.h:59 include/linux/cgroup-defs.h:691 kernel/signal.c:2598) 
[   51.494749] do_exit (kernel/exit.c:818) 
[   51.498387] ? process_one_work (kernel/workqueue.c:2239) 
[   51.502698] ? kthread (kernel/kthread.c:246) 
[   51.506218] rewind_stack_do_exit (??:?) 
[   51.510639] note: kworker/u4:2[177] exited with preempt_count 1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ