linux-ext4 - fsync stuck at jbd2_log_wait

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <COL130-W652BF73D365EBB4F3DDF69D67E0@phx.gbl>
Date:	Wed, 12 Aug 2015 09:57:51 -0700
From:	Roy Yang <bsdnet@...look.com>
To:	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: fsync stuck at jbd2_log_wait_commit on NVMe devices

Hi EXT4 Experts, 

We came cross an ext4 journaling problem, and I did some googling on high I/O wait, 
found a lot of links, but none of them is the same.

The backtrace when system is hung is :

Aug 7 11:31:33 2f4526ab kernel: "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
...
Aug 7 11:31:33 2f4526ab kernel: Call Trace:
Aug 7 11:31:33 2f4526ab kernel: [<ffffffff816096a9>] schedule+0x29/0x70
Aug 7 11:31:33 2f4526ab kernel: [<ffffffffa031d785>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Aug 7 11:31:33 2f4526ab kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
Aug 7 11:31:33 2f4526ab kernel: [<ffffffffa031fe32>] jbd2_complete_transaction+0x52/0xa0 [jbd2]
Aug 7 11:31:33 2f4526ab kernel: [<ffffffffa03339f2>] ext4_sync_file+0x292/0x320 [ext4]
Aug 7 11:31:33 2f4526ab kernel: [<ffffffff811f6ba5>] do_fsync+0x65/0xa0
Aug 7 11:31:33 2f4526ab kernel: [<ffffffff811f6e70>] SyS_fsync+0x10/0x20
Aug 7 11:31:33 2f4526ab kernel: [<ffffffff81614209>] system_call_fastpath+0x16/0x1b

At application layer, we use a lot of fsync APIs. When I/O load is heavy, we saw a lot of I/Os can't finish within 120 seconds
because JBD2 can't finish the journaling. We also see when this happens, the JBD2 info shows very high on the following two items:

$ cat /proc/fs/jbd2/nvme0n1/info
85162438 transactions (85162436 requested), each up to 8192 blocks
average: 
  0ms waiting for transaction
  0ms request delay
  1ms running transaction                 <==== this will be very high, 3 to 6 seconds 
  0ms transaction was being locked
  0ms flushing data (in ordered mode)
  0ms logging transaction
  153us average transaction commit time     <=== this will be very high, half to one second
  35 handles per transaction
  2 blocks per transaction
  3 logged blocks per transaction

Our environment :
$ mount 
/dev/nvme0n1 on /logs type ext4 (rw,noatime,data=ordered)

$ uname -a
Linux a69ac961 3.10.0-229.7.2.el7.x86_64 #1 SMP Tue Jun 23 22:06:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/redhat-release 
CentOS Linux release 7.1.1503 (Core) 


$ sudo tune2fs  -l /dev/nvme0n1
tune2fs 1.42.9 (28-Dec-2013)
Filesystem volume name:   cty-data
Last mounted on:          <not available>
Filesystem UUID:          227fde0d-a3ae-4236-9d32-46ef0091b3b3
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              91062272
Block count:              364226560
Reserved block count:     3642265
Free blocks:              126192987
Free inodes:              90916973
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Wed Aug  5 13:00:41 2015
Last mount time:          Mon Aug 10 18:32:59 2015
Last write time:          Mon Aug 10 18:32:59 2015
Mount count:              6
Maximum mount count:      -1
Last checked:             Wed Aug  5 13:00:41 2015
Check interval:           0 (<none>)
Lifetime writes:          135 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:	          256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      c8f8aa7c-583b-4251

Any suggestions  or hints how to debug this problem will be appreciated!

Thank you very much, 

Roy
© 2015 Microsoft Terms Privacy & cookies Developers English (United States) 

          		 	   		  --
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html